Probability that a two-stage genome-wide association study will detect a disease-associated SNP and implications for multistage designs

Ann Hum Genet. 2008 Nov;72(Pt 6):812-20. doi: 10.1111/j.1469-1809.2008.00467.x. Epub 2008 Jul 24.

Abstract

Large two-stage genome-wide association studies (GWASs) have been shown to reduce required genotyping with little loss of power, compared to a one-stage design, provided a substantial fraction of cases and controls, pi(sample), is included in stage 1. However, a number of recent GWASs have used pi(sample) < 0.2. Moreover, standard power calculations are not applicable because SNPs are selected in stage 1 by ranking their p-values, rather than comparing each SNP's statistic to a fixed critical value. We define the detection probability (DP) of a two-stage design as the probability that a given disease-associated SNP will have a p-value among the lowest ranks of p-values at stage 1, and, among those SNPs selected at stage 1, at stage 2. For 8000 cases and 8000 controls available for study and for odds ratios per allele in the range 1.1-1.3, we show that DP is substantially reduced for designs with pi(sample) ≤ 0.25, and that DP cannot be appreciably increased by analyzing the stage 1 and stage 2 data jointly. These results suggest that multistage designs with small first stages (e.g. pi(sample) ≤ 0.25) should be avoided, and that additional genotyping in earlier studies with small first stages will yield previously unselected disease-associated SNPs.
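The rank-based definition of DP above can be illustrated with a small Monte Carlo sketch. This is not the authors' method; it is a simplified illustration under assumptions that do not come from the abstract: SNPs are treated as independent (no linkage disequilibrium), null test statistics are standard normal, the disease SNP's statistic is normal with a mean approximated from the per-allele odds ratio and allele frequency, and the stage 2 ranking uses stage 2 data only. The function name `detection_probability`, the allele frequency, the number of SNPs `m_snps`, and the selection cut-offs `t1` and `t2` are all hypothetical choices made for the example.

```python
import math
import numpy as np


def two_sided_p(z):
    """Two-sided normal p-values, P(|Z| > |z|) = erfc(|z| / sqrt(2))."""
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])


def detection_probability(odds_ratio, maf, n_cases, pi_sample,
                          m_snps=500_000, t1=25_000, t2=25,
                          n_rep=5_000, seed=0):
    """Monte Carlo estimate of DP for one disease SNP (illustrative only).

    Assumes equal numbers of cases and controls, independent SNPs, and
    hypothetical cut-offs: the top t1 of m_snps SNPs pass stage 1, and the
    disease SNP must then rank in the top t2 on stage 2 data alone.
    """
    rng = np.random.default_rng(seed)

    # Approximate mean of the full-sample z statistic for an allelic test:
    # log(OR) divided by its standard error, about 1 / sqrt(N * p * (1 - p)).
    mu = math.log(odds_ratio) * math.sqrt(n_cases * maf * (1.0 - maf))

    # Disease-SNP statistics in the two independent sub-samples.
    z1 = rng.normal(mu * math.sqrt(pi_sample), 1.0, n_rep)
    z2 = rng.normal(mu * math.sqrt(1.0 - pi_sample), 1.0, n_rep)

    # Null p-values are uniform, so the number of null SNPs that beat the
    # disease SNP is binomial with success probability equal to its p-value.
    better1 = rng.binomial(m_snps - 1, two_sided_p(z1))
    better2 = rng.binomial(t1 - 1, two_sided_p(z2))

    # Detected: ranked within the top t1 at stage 1 and the top t2 at stage 2.
    detected = (better1 < t1) & (better2 < t2)
    return float(detected.mean())


for pi_sample in (0.1, 0.25, 0.5):
    dp = detection_probability(1.2, 0.2, 8000, pi_sample)
    print(f"pi(sample) = {pi_sample:.2f}: estimated DP = {dp:.3f}")
```

Drawing the rank as a binomial count avoids simulating every null SNP in each replicate. Under these assumptions the estimate falls as pi(sample) shrinks, because the disease SNP more often misses the stage 1 cut; the actual values depend heavily on the hypothetical cut-offs and allele frequency chosen here.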

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Genetic Predisposition to Disease*
  • Genome, Human*
  • Humans
  • Linkage Disequilibrium
  • Models, Statistical*
  • Polymorphism, Single Nucleotide