A practical false discovery rate approach to identifying patterns of differential expression in microarray data

Bioinformatics. 2005 Jun 1;21(11):2684-90. doi: 10.1093/bioinformatics/bti407. Epub 2005 Mar 29.


Searching for differentially expressed genes is one of the most common applications of microarrays, yet achieving adequate statistical rigor while remaining practical is difficult. False discovery rate (FDR) approaches have become relatively standard; however, how to define and control the FDR has been hotly debated. Permutation estimation approaches such as SAM and PaGE can be effective, but they leave much room for improvement. We pursue the permutation estimation method and describe a convenient definition of the FDR that can be estimated in a straightforward manner. We then discuss issues regarding the choice of statistic and data transformation. It is impossible to optimize the power of any statistic for thousands of genes simultaneously, and we examine the practical consequences of this. For example, the log transform can help for some genes and hurt for others in the same dataset. We examine issues surrounding the SAM 'fudge factor' parameter and how to handle them by optimizing with respect to power.
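
To make the permutation estimation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a two-group design, a SAM-style statistic in which a constant s0 (the 'fudge factor') is added to the per-gene standard error, and a simple plug-in FDR estimate: the mean number of genes exceeding a threshold under random relabeling, divided by the number exceeding it with the true labels. The function names, the threshold argument, and this particular FDR ratio are illustrative assumptions; the abstract does not state the paper's own FDR definition.

```python
import numpy as np


def sam_like_stat(x, y, s0=0.0):
    """Per-gene statistic: mean difference over (pooled standard error + s0).

    x, y: arrays of shape (genes, samples) for the two groups.
    s0:   hypothetical 'fudge factor'; s0 = 0 gives the ordinary t-statistic.
    """
    nx, ny = x.shape[1], y.shape[1]
    diff = x.mean(axis=1) - y.mean(axis=1)
    pooled_var = (
        (nx - 1) * x.var(axis=1, ddof=1) + (ny - 1) * y.var(axis=1, ddof=1)
    ) / (nx + ny - 2)
    se = np.sqrt(pooled_var * (1.0 / nx + 1.0 / ny))
    return diff / (se + s0)


def permutation_fdr(data, labels, threshold, s0=0.0, n_perm=200, seed=0):
    """Estimate the FDR of calling genes with |statistic| >= threshold.

    Assumed plug-in estimate (not taken from the paper):
    mean count of calls under permuted labels / count of calls observed.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)

    observed = np.abs(sam_like_stat(data[:, labels], data[:, ~labels], s0))
    n_called = int((observed >= threshold).sum())

    false_counts = []
    for _ in range(n_perm):
        perm = rng.permutation(labels)  # shuffle group membership
        null = np.abs(sam_like_stat(data[:, perm], data[:, ~perm], s0))
        false_counts.append((null >= threshold).sum())

    expected_false = float(np.mean(false_counts))
    fdr = min(1.0, expected_false / n_called) if n_called > 0 else 0.0
    return n_called, fdr
```

Calling permutation_fdr on a genes-by-samples matrix with a boolean group vector returns the number of genes called and the estimated FDR at that threshold. Rerunning with different s0 values, or on log-transformed data, gives a rough way to see how those choices change the number of calls at a comparable estimated FDR.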

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Computer Simulation
  • False Positive Reactions
  • Gene Expression Profiling / methods*
  • Models, Genetic*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated / methods*