The power of genome-wide association studies of complex disease genes: statistical limitations of indirect approaches using SNP markers

J Hum Genet. 2001;46(8):478-82. doi: 10.1007/s100380170048.


Genome-wide association studies using a dense map of single nucleotide polymorphism (SNP) markers appear to make it possible to detect many complex disease genes. In such indirect association studies, whether a susceptibility gene can be detected depends not only on the degree of linkage disequilibrium between the disease variant and the SNP marker but also on the difference in their allele frequencies. These factors, together with the penetrance of the disease variant, influence the statistical power of such approaches. However, the power of indirect association studies is not well understood. We calculated the number of individuals needed to detect the disease variant in both direct and indirect association studies with a case-control design. The results show that a marked reduction in the statistical power of indirect studies, compared with that of direct ones, is unavoidable in genome-wide screening for complex disease genes. If the allele frequencies of the disease variant and the marker differ greatly, the disease variant cannot be detected. Because the frequency of the disease variant is unknown, SNP markers with a range of allele frequencies, or a large number of SNP markers, must be used in indirect association studies. However, as the number of SNP markers increases, the obtained P value may fail to reach the significance level after Bonferroni adjustment. Thus, we should identify functional SNPs in as many genes as possible so that a possible association between functional variants and a complex disease can be tested directly in genome-wide association studies.
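The dependence of sample size on allele-frequency mismatch and on Bonferroni adjustment can be illustrated with a rough sketch. This is not the calculation used in the paper; it rests on assumptions that are not in the abstract: a multiplicative risk model with a rare disease (so control allele frequency approximates the population frequency), a two-sided two-proportion normal approximation on allele counts, the commonly used approximation that an indirect study needs about 1/r^2 times the sample of a direct one, and the best-case r^2 attainable when the disease and marker alleles are in coupling. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def max_r2(p_disease, p_marker):
    """Best-case r^2 between two biallelic loci whose alleles are in coupling.

    With lo <= hi the two allele frequencies, D_max = lo * (1 - hi), which gives
    r^2_max = lo * (1 - hi) / (hi * (1 - lo)). It equals 1 only when lo == hi.
    """
    lo, hi = sorted((p_disease, p_marker))
    return lo * (1 - hi) / (hi * (1 - lo))


def case_allele_freq(p, grr):
    """Allele frequency in cases under a multiplicative model (assumption)."""
    return grr * p / (grr * p + 1 - p)


def alleles_per_group(p_ctrl, p_case, alpha, power):
    """Alleles needed per group for a two-sided two-proportion z-test."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)
    z_b = z.inv_cdf(power)
    p_bar = (p_ctrl + p_case) / 2
    root_null = math.sqrt(2 * p_bar * (1 - p_bar))
    root_alt = math.sqrt(p_ctrl * (1 - p_ctrl) + p_case * (1 - p_case))
    return (z_a * root_null + z_b * root_alt) ** 2 / (p_case - p_ctrl) ** 2


def individuals_direct(p, grr, alpha=0.05, power=0.8, n_tests=1):
    """Cases (and, equally, controls) needed when the disease variant is typed.

    alpha is divided by n_tests to mimic a Bonferroni adjustment.
    """
    n = alleles_per_group(p, case_allele_freq(p, grr), alpha / n_tests, power)
    return math.ceil(n / 2)  # two alleles per individual


def individuals_indirect(p, p_marker, grr, alpha=0.05, power=0.8, n_tests=1):
    """Approximate individuals needed when only a linked marker is typed.

    Uses N_indirect ~ N_direct / r^2 with the best-case r^2 (an approximation).
    """
    n = alleles_per_group(p, case_allele_freq(p, grr), alpha / n_tests, power)
    return math.ceil(n / 2 / max_r2(p, p_marker))


if __name__ == "__main__":
    p, grr = 0.1, 1.5  # illustrative values only
    for p_marker in (0.1, 0.2, 0.5):
        print(p_marker, individuals_indirect(p, p_marker, grr))
    print("Bonferroni, 100000 markers:", individuals_direct(p, grr, n_tests=100000))
```

Under these assumptions the required sample grows quickly as the marker allele frequency moves away from that of the disease variant, and again when the significance threshold is divided by the number of markers tested, which mirrors the two limitations described above.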

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Frequency
  • Genetic Linkage
  • Genetic Markers
  • Genetic Predisposition to Disease
  • Genetic Variation
  • Genome, Human
  • Humans
  • Linkage Disequilibrium*
  • Models, Genetic
  • Models, Statistical
  • Polymorphism, Single Nucleotide / genetics*
  • Research Design / statistics & numerical data*
  • Sample Size
