Robust-linear-model normalization to reduce technical variability in functional protein microarrays

J Proteome Res. 2009 Dec;8(12):5451-64. doi: 10.1021/pr900412k.


Protein microarrays are similar to DNA microarrays in that both enable the parallel interrogation of thousands of probes immobilized on a surface. Consequently, they have benefited from technologies previously developed for DNA microarrays. However, assumptions made in the analysis of DNA microarrays do not always translate to protein arrays, especially in the case of normalization. Hence, we have developed an experimental and computational framework to assess normalization procedures for protein microarrays. Specifically, we profiled two sera with markedly different autoantibody compositions. To analyze intra- and interarray variability, we compared a set of control proteins across subarrays and the corresponding spots across multiple arrays, respectively. To estimate the degree to which normalization could help reveal true biological separability, we tested the difference in signal between the sera relative to the variability within replicates. Next, by mixing the sera in different proportions (titrations), we correlated the reactivity of proteins with serum concentration. Finally, we analyzed the effect of normalization procedures on the list of reactive proteins. We compared global and quantile normalization, techniques that have traditionally been employed for DNA microarrays, with a novel normalization approach based on a robust linear model (RLM) that makes explicit use of control proteins. We show that RLM normalization is able to reduce both intra- and interarray technical variability while maintaining biological differences. Moreover, in titration experiments, RLM normalization enhances the correlation of protein signals with serum concentration. Conversely, while quantile and global normalization can reduce interarray technical variability, neither is as effective as RLM normalization in maintaining biological differences. Most importantly, both introduce artifacts that distort the signals and affect the correct identification of reactive proteins, impairing their use for biomarker discovery. Hence, we show that RLM normalization is better suited to protein arrays than approaches used for DNA microarrays.
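
The abstract does not spell out the model, so the following is only an illustrative sketch of what a control-based robust normalization might look like, not the authors' implementation. It assumes that the log signal of each control spot is the sum of an array effect, a subarray (block) effect, and a control-protein term, and it fits that additive model by iteratively reweighted least squares with Huber weights. The function names, the tuple layout, and the tuning constants are all hypothetical.

```python
from collections import defaultdict
from statistics import median

HUBER_K = 1.345  # conventional Huber tuning constant (an assumption here)


def fit_control_effects(controls, n_irls=30, n_sweeps=50):
    """Estimate array and block effects from control spots.

    controls: list of (array_id, block_id, protein_id, log_signal).
    Assumed model: log_signal = array + block + protein + error.
    """
    # effects[0] = array, effects[1] = block, effects[2] = control protein
    effects = [defaultdict(float), defaultdict(float), defaultdict(float)]
    weights = [1.0] * len(controls)
    for _ in range(n_irls):
        # Weighted least squares by backfitting: update one factor at a
        # time as the weighted mean of the partial residuals.
        for _ in range(n_sweeps):
            for f in (2, 0, 1):
                num, den = defaultdict(float), defaultdict(float)
                for w, obs in zip(weights, controls):
                    partial = obs[3] - sum(
                        effects[g][obs[g]] for g in range(3) if g != f
                    )
                    num[obs[f]] += w * partial
                    den[obs[f]] += w
                for level in num:
                    effects[f][level] = num[level] / den[level]
                if f != 2:
                    # Centre array/block effects at zero; the protein
                    # terms absorb the shift so the fit is unchanged.
                    shift = sum(effects[f].values()) / len(effects[f])
                    for level in effects[f]:
                        effects[f][level] -= shift
                    for level in effects[2]:
                        effects[2][level] += shift
        # Huber reweighting: spots with large residuals get less weight.
        residuals = [
            obs[3] - sum(effects[g][obs[g]] for g in range(3))
            for obs in controls
        ]
        scale = max(median(abs(r) for r in residuals) / 0.6745, 1e-9)
        weights = [
            min(1.0, HUBER_K * scale / abs(r)) if r else 1.0
            for r in residuals
        ]
    return effects[0], effects[1]


def normalize(spots, array_fx, block_fx):
    """Subtract the estimated array and block effects from every spot."""
    return [(a, b, p, y - array_fx[a] - block_fx[b]) for a, b, p, y in spots]
```

In this sketch only the control spots inform the correction, and the fitted array and block effects are then subtracted from every spot on the array. Quantile normalization, by contrast, forces every array to share the same intensity distribution whether or not the sera genuinely differ.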

MeSH terms

  • Autoantibodies / blood*
  • Humans
  • Linear Models*
  • Models, Statistical
  • Normal Distribution
  • Protein Array Analysis / statistics & numerical data*


Substances

  • Autoantibodies