Motivation: Reverse phase protein array (RPPA) is a powerful dot-blot technology that allows studying protein expression levels as well as post-translational modifications in a large number of samples simultaneously. Yet, correct interpretation of RPPA data has remained a major challenge for its broad-scale application and its translation into clinical research. Satisfying quantification tools are available to assess a relative protein expression level from a serial dilution curve. However, appropriate tools allowing the normalization of the data for external sources of variation are currently missing.
Results: Here we propose a new method, called NormaCurve, that allows simultaneous quantification and normalization of RPPA data. For this, we modified the quantification method SuperCurve in order to include normalization for (i) background fluorescence, (ii) variation in the total amount of spotted protein and (iii) spatial bias on the arrays. Using a spike-in design with a purified protein, we test the capacity of different models to properly estimate normalized relative expression levels. The best performing model, NormaCurve, takes into account a negative control array without primary antibody, an array stained with a total protein stain and spatial covariates. We show that this normalization is reproducible and we discuss the number of serial dilutions and the number of replicates that are required to obtain robust data. We thus provide a ready-to-use method for reliable and reproducible normalization of RPPA data, which should facilitate the interpretation and the development of this promising technology.
Availability: The raw data, the scripts and the normacurve package are available at the following web site: http://microarrays.curie.fr.