Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies

Nat Genet. 2006 Feb;38(2):209-13. doi: 10.1038/ng1706. Epub 2006 Jan 15.


Genome-wide association is a promising approach to identify common genetic variants that predispose to human disease. Because of the high cost of genotyping hundreds of thousands of markers on thousands of subjects, genome-wide association studies often follow a staged design in which a proportion (π_samples) of the available samples are genotyped on a large number of markers in stage 1, and a proportion (π_markers) of these markers are later followed up by genotyping them on the remaining samples in stage 2. The standard strategy for analyzing such two-stage data is to view stage 2 as a replication study and focus on findings that reach statistical significance when stage 2 data are considered alone. We demonstrate that the alternative strategy of jointly analyzing the data from both stages almost always results in increased power to detect genetic association, despite the need to use more stringent significance levels, even when effect sizes differ between the two stages. We recommend joint analysis for all two-stage genome-wide association studies, especially when a relatively large proportion of the samples are genotyped in stage 1 (π_samples ≥ 0.30), and a relatively large proportion of markers are selected for follow-up in stage 2 (π_markers ≥ 0.01).
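The comparison described in the abstract can be illustrated with a rough numerical sketch. The code below is not the authors' implementation; it is one plausible formalization under stated assumptions: each marker yields independent standard-normal z statistics in stage 1 and stage 2, the effect size is the same in both stages, the genome-wide error rate is controlled by a Bonferroni-style per-marker level, and the replication test requires the stage 2 effect to point in the same direction as stage 1. The marker count (300,000), the noncentrality `mu = 5.5`, and all function names are hypothetical choices made for illustration only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def upper_tail(x):
    """P(Z > x) for a standard normal Z, using erfc for accuracy in the tails."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def joint_pass_prob(c1, c_joint, pi_samples, mu1=0.0, mu2=0.0, steps=4000):
    """P(|z1| > c1 and |z_joint| > c_joint), by midpoint integration over z1.

    z_joint = sqrt(pi_samples) * z1 + sqrt(1 - pi_samples) * z2, where
    z1 ~ N(mu1, 1) and z2 ~ N(mu2, 1) are independent (an assumption).
    """
    w1, w2 = math.sqrt(pi_samples), math.sqrt(1.0 - pi_samples)
    h = (max(c1, abs(mu1)) + 9.0 - c1) / steps  # integrate well past the mass
    total = 0.0
    for sign in (1.0, -1.0):  # both tails of the stage 1 selection rule
        for i in range(steps):
            z1 = sign * (c1 + (i + 0.5) * h)
            above = upper_tail((c_joint - w1 * z1) / w2 - mu2)  # z_joint > c
            below = upper_tail((c_joint + w1 * z1) / w2 + mu2)  # z_joint < -c
            total += normal_pdf(z1 - mu1) * (above + below) * h
    return total


def joint_threshold(c1, pi_samples, target):
    """Bisection for the joint critical value giving null probability `target`."""
    lo, hi = 0.0, 10.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if joint_pass_prob(c1, mid, pi_samples) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def compare_power(mu, pi_samples=0.30, pi_markers=0.01,
                  n_markers=300_000, alpha=0.05):
    """Power of replication-based vs joint analysis for one causal marker.

    `mu` is the expected z statistic if all samples were genotyped at once.
    """
    per_marker = alpha / n_markers
    mu1 = mu * math.sqrt(pi_samples)        # expected stage 1 statistic
    mu2 = mu * math.sqrt(1.0 - pi_samples)  # expected stage 2 statistic
    c1 = STD_NORMAL.inv_cdf(1.0 - pi_markers / 2.0)  # two-sided stage 1 cut
    # Replication: one-sided stage 2 test in the stage 1 direction,
    # corrected for the markers carried forward.
    c2 = STD_NORMAL.inv_cdf(1.0 - per_marker / pi_markers)
    replication = (upper_tail(c1 - mu1) * upper_tail(c2 - mu2)
                   + upper_tail(c1 + mu1) * upper_tail(c2 + mu2))
    # Joint: threshold calibrated on the combined statistic, conditional on
    # passing stage 1, to the same per-marker false positive rate.
    c_joint = joint_threshold(c1, pi_samples, per_marker)
    joint = joint_pass_prob(c1, c_joint, pi_samples, mu1, mu2)
    return {"c1": c1, "c2": c2, "c_joint": c_joint,
            "replication_power": replication, "joint_power": joint}


result = compare_power(mu=5.5)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the joint critical value is higher than the stage 2 replication threshold, which reflects the "more stringent significance levels" mentioned above, yet the combined statistic uses all samples and so can still yield higher power. Varying `pi_samples` and `pi_markers` in `compare_power` gives a feel for how the gap changes with the design.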

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Alleles
  • Case-Control Studies
  • DNA Replication / genetics*
  • Gene Frequency / genetics
  • Genetic Heterogeneity
  • Genetic Markers / genetics
  • Genetic Predisposition to Disease / genetics*
  • Genetics, Medical / methods*
  • Genome, Human / genetics*
  • Genotype
  • Humans


Substances

  • Genetic Markers