Repeatability of published microarray gene expression analyses

Nat Genet. 2009 Feb;41(2):149-55. doi: 10.1038/ng.295. Epub 2008 Jan 28.


Given the complexity of microarray-based gene expression studies, guidelines encourage transparent design and public data availability. Several journals require public data deposition, and several public databases exist. However, not all data are publicly available, and even when they are, it is unknown whether the published results can be reproduced by independent scientists. Here we evaluated the replication of data analyses in 18 articles on microarray-based gene expression profiling published in Nature Genetics in 2005-2006. One table or figure from each article was independently evaluated by two teams of analysts. We reproduced two analyses in principle and six partially or with some discrepancies; ten could not be reproduced. The main reason for failure to reproduce was data unavailability, and discrepancies were mostly due to incomplete data annotation or incomplete specification of data processing and analysis. Repeatability of published microarray studies is apparently limited. Stricter publication rules enforcing public data availability and explicit description of data processing and analysis should be considered.

Publication types

  • Evaluation Study

MeSH terms

  • Animals
  • Data Interpretation, Statistical
  • Databases, Genetic*
  • Gene Expression Profiling / standards*
  • Genome-Wide Association Study / standards
  • Humans
  • Oligonucleotide Array Sequence Analysis / standards*
  • Peer Review, Research*
  • Publications / standards
  • Reproducibility of Results