Sample size determination in microarray experiments for class comparison and prognostic classification

Biostatistics. 2005 Jan;6(1):27-38. doi: 10.1093/biostatistics/kxh015.


Determining sample sizes for microarray experiments is important, but the complexity of these experiments and the large amounts of data they produce can make the sample size issue seem daunting and tempt researchers to use rules of thumb in place of formal calculations based on the goals of the experiment. Here we present formulae for determining sample sizes to achieve a variety of experimental goals, including class comparison and the development of prognostic markers. Results are derived that describe the impact of pooling, technical replicates and dye-swap arrays on sample size requirements. These results are shown to depend on the relative sizes of different sources of variability. A variety of common experimental situations and designs used with single-label and dual-label microarrays are considered. We discuss procedures for controlling the false discovery rate. Our calculations are based on relatively simple yet realistic statistical models for the data, and provide straightforward sample size calculation formulae.
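
The abstract does not reproduce the paper's formulae. As a rough illustration of the kind of calculation involved, the sketch below applies the standard normal-approximation sample size formula for comparing two equal-sized classes on a single gene, n per class = 2 * sigma^2 * (z_{alpha/2} + z_{beta})^2 / delta^2. This is a textbook formula and is not taken from the article; the function name and the parameters `delta` (mean difference between classes on the log scale), `sigma` (within-class standard deviation), `alpha` (two-sided per-gene significance level) and `power` are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def arrays_per_class(delta, sigma, alpha=0.001, power=0.95):
    """Hypothetical helper: normal-approximation sample size per class.

    Assumes two equal-sized classes, a two-sided test at level `alpha`
    applied to one gene, a true mean difference `delta` on the log scale,
    and a common within-class standard deviation `sigma`.
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for desired power
    n = 2 * sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)  # round up to a whole number of arrays


# Example: detect a 2-fold change (delta = 1 on the log2 scale),
# with an assumed within-class standard deviation of 0.5.
n_per_class = arrays_per_class(delta=1.0, sigma=0.5)
print(n_per_class)
```

A stringent per-gene `alpha` such as 0.001 is one simple way to limit false positives across thousands of genes. It is used here only as an assumed default and is not a substitute for the false discovery rate procedures the article discusses, nor does the sketch account for pooling, technical replicates or dye-swap designs.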

MeSH terms

  • Biomarkers
  • False Positive Reactions
  • Humans
  • Linear Models*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Predictive Value of Tests
  • Prognosis
  • Sample Size*

