Probability that a two-stage genome-wide association study will detect a disease-associated SNP and implications for multistage designs

M H Gail; R M Pfeiffer; W Wheeler; D Pee

doi:10.1111/j.1469-1809.2008.00467.x

Ann Hum Genet. 2008 Nov;72(Pt 6):812-20. doi: 10.1111/j.1469-1809.2008.00467.x. Epub 2008 Jul 24.

Authors

M H Gail¹, R M Pfeiffer, W Wheeler, D Pee

Affiliation

¹ Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892-7244, US. gailm@mail.nih.gov

Abstract

Large two-stage genome-wide association studies (GWASs) have been shown to reduce required genotyping with little loss of power, compared to a one-stage design, provided a substantial fraction of cases and controls, π_sample, is included in stage 1. However, a number of recent GWASs have used π_sample < 0.2. Moreover, standard power calculations are not applicable because SNPs are selected in stage 1 by ranking their p-values, rather than by comparing each SNP's statistic to a fixed critical value. We define the detection probability (DP) of a two-stage design as the probability that a given disease-associated SNP will have a p-value among the lowest-ranked p-values at stage 1 and then, among the SNPs selected at stage 1, among the lowest-ranked p-values at stage 2. For 8000 cases and 8000 controls available for study and for odds ratios per allele in the range 1.1-1.3, we show that DP is substantially reduced for designs with π_sample ≤ 0.25, and that DP cannot be appreciably increased by analyzing the stage 1 and stage 2 data jointly. These results suggest that multistage designs with small first stages (e.g., π_sample ≤ 0.25) should be avoided, and that additional genotyping in earlier studies with small first stages will yield previously unselected disease-associated SNPs.
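
The rank-based definition of DP can be illustrated with a small Monte Carlo sketch. This is not the authors' method or code; it is a simplified illustration under assumptions that are not stated in the abstract: SNPs are independent, each null SNP's test statistic is standard normal, and the disease SNP's statistic is normal with mean log(OR) × sqrt(n × MAF × (1 − MAF)), where n is the number of cases (equal to the number of controls) in the stage. The function name, the minor allele frequency, the number of SNPs, and the numbers of SNPs retained at each stage (`t1`, `t2`) are all hypothetical choices, kept small so the example runs quickly.

```python
import math
import random


def detection_probability(n_cases, pi_sample, odds_ratio, maf,
                          n_snps, t1, t2, n_reps=200, seed=0):
    """Monte Carlo estimate of DP for one disease SNP among independent null SNPs.

    n_cases   : total cases available (controls assumed equal in number)
    pi_sample : fraction of cases and controls genotyped in stage 1
    t1, t2    : numbers of top-ranked SNPs retained at stage 1 and stage 2
    """
    rng = random.Random(seed)
    beta = math.log(odds_ratio)
    n1 = pi_sample * n_cases   # cases (= controls) used in stage 1
    n2 = n_cases - n1          # cases (= controls) used in stage 2
    # Assumed normal approximation for the disease SNP's mean Z statistic.
    mu1 = beta * math.sqrt(n1 * maf * (1 - maf))
    mu2 = beta * math.sqrt(n2 * maf * (1 - maf))

    detected = 0
    for _ in range(n_reps):
        # Stage 1: count null SNPs whose |Z| beats the disease SNP's |Z|.
        z1 = abs(rng.gauss(mu1, 1.0))
        better1 = sum(1 for _ in range(n_snps - 1)
                      if abs(rng.gauss(0.0, 1.0)) > z1)
        if better1 >= t1:
            continue  # disease SNP is not among the t1 top-ranked SNPs
        # Stage 2: rank the disease SNP against the t1 - 1 null SNPs carried
        # forward, using stage 2 data only (fresh, independent statistics).
        z2 = abs(rng.gauss(mu2, 1.0))
        better2 = sum(1 for _ in range(t1 - 1)
                      if abs(rng.gauss(0.0, 1.0)) > z2)
        if better2 < t2:
            detected += 1
    return detected / n_reps


if __name__ == "__main__":
    # Hypothetical settings: 8000 cases and 8000 controls, OR 1.2, MAF 0.3.
    for pi in (0.1, 0.25, 0.5):
        dp = detection_probability(8000, pi, 1.2, 0.3,
                                   n_snps=2000, t1=100, t2=5)
        print(f"pi_sample={pi}: estimated DP = {dp:.2f}")
```

Because selection is by rank rather than by a fixed critical value, the sketch compares the disease SNP's statistic with the simulated null statistics in each replicate instead of with a preset threshold. It covers only the stage-wise ranking in the DP definition, not the joint analysis of stage 1 and stage 2 data mentioned in the abstract.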

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Genetic Predisposition to Disease*
Genome, Human*
Humans
Linkage Disequilibrium
Models, Statistical*
Polymorphism, Single Nucleotide

Grants and funding

Z01 CP010181-05/ImNIH/Intramural NIH HHS/United States