On the necessity of different statistical treatment for Illumina BeadChip and Affymetrix GeneChip data and its significance for biological interpretation

Biol Direct. 2008 Jun 3;3:23. doi: 10.1186/1745-6150-3-23.

Abstract

Background: The original spotted array technology, which relies on competitive hybridization of two experimental samples and measures relative expression levels, is increasingly being displaced by more accurate platforms that determine absolute expression values for a single sample (for example, Affymetrix GeneChip and Illumina BeadChip). Unfortunately, cross-platform comparisons show disappointingly low concordance between the lists of regulated genes obtained with the latter two platforms.

Results: Whereas expression values determined with a single Affymetrix GeneChip represent single measurements, the expression results obtained with an Illumina BeadChip are essentially statistical means over several dozen identical probes. In the case of multiple technical replicates, the data therefore require different statistical treatment depending on the platform. The key is to compute the squared standard deviation within replicates for Illumina data as the weighted mean of the squared standard deviations of the individual experiments. With an Illumina spike experiment, we demonstrate dramatically improved significance of spiked genes over all relevant concentration ranges. The re-evaluation of two published Illumina datasets (membrane type-1 matrix metalloproteinase expression in mammary epithelial cells by Golubkov et al. Cancer Research (2006) 66, 10460; spermatogenesis in normal and teratozoospermic men, Platts et al. Human Molecular Genetics (2007) 16, 763) identified significantly more biologically relevant genes as transcriptionally regulated targets and, thus, additional biological pathways involved.
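The weighted-mean computation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the weights, so the sketch assumes each BeadChip is weighted by its bead count minus one (degrees of freedom), and the function name `pooled_sd` and all input values are hypothetical.

```python
from math import sqrt


def pooled_sd(std_errors, bead_counts):
    """Combine per-chip standard errors of one probe into a single SD.

    The squared SD is the weighted mean of the squared per-chip values.
    Assumption (not from the source): weight = bead count - 1.
    """
    if not std_errors or len(std_errors) != len(bead_counts):
        raise ValueError("need one bead count per standard error")
    weights = [n - 1 for n in bead_counts]
    if any(w <= 0 for w in weights):
        raise ValueError("each chip needs at least two beads")
    # Weighted mean of the squared standard errors.
    variance = sum(w * se ** 2 for w, se in zip(weights, std_errors)) / sum(weights)
    return sqrt(variance)


# Hypothetical example: one probe measured on two technical replicates.
std_errors = [2.0, 4.0]   # per-chip standard error of the bead mean
bead_counts = [11, 31]    # number of beads behind each mean
print(pooled_sd(std_errors, bead_counts))
```

With these hypothetical inputs the weights are 10 and 30, so the squared value is (10·4 + 30·16)/40 = 13. A different weighting scheme would change only the `weights` line.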

Conclusion: The results of this work show that it is important to process Illumina BeadChip data with a modified statistical procedure and, in experiments with technical replicates, to compute the standard deviation from the standard errors of the individual BeadChips. This change also leads to improved concordance with Affymetrix GeneChip results, as the re-evaluation of the spermatogenesis dataset demonstrates.

Reviewers: This article was reviewed by I. King Jordan, Mark J. Dunning and Shamil Sunyaev.

Publication types

  • Comparative Study

MeSH terms

  • Animals
  • Computer Simulation
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / statistics & numerical data*
  • Humans
  • Male
  • Mice
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis / methods
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Polysaccharides / biosynthesis
  • Signal Transduction / genetics
  • Spermatozoa / enzymology
  • Spermatozoa / metabolism

Substances

  • Polysaccharides