Analysis of multiple SNPs in a candidate gene or region

Genet Epidemiol. 2008 Sep;32(6):560-6. doi: 10.1002/gepi.20330.


We consider the analysis of multiple single nucleotide polymorphisms (SNPs) within a gene or region. The simplest analysis of such data is based on a series of single SNP hypothesis tests, followed by correction for multiple testing, but it is intuitively plausible that a joint analysis of the SNPs will have higher power, particularly when the causal locus may not have been observed. However, standard tests, such as a likelihood ratio test based on an unrestricted alternative hypothesis, tend to have large numbers of degrees of freedom and hence low power. This has motivated a number of alternative test statistics. Here we compare several of the competing methods, including the multivariate score test (Hotelling's test) of Chapman et al. ([2003] Hum. Hered. 56:18-31), Fisher's method for combining P-values, the minimum P-value approach, a Fourier-transform-based approach recently suggested by Wang and Elston ([2007] Am. J. Hum. Genet. 80:353-360) and a Bayesian score statistic proposed for microarray data by Goeman et al. ([2005] J. R. Stat. Soc. B 68:477-493). Some relationships between these methods are pointed out, and simulation results are given to show that the minimum P-value and the Goeman et al. approaches work well over a range of scenarios. The Wang and Elston approach often performs poorly; we explain why, and show how its performance can be substantially improved.
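
As a rough illustration of two of the simpler methods named in the abstract, the sketch below computes Fisher's combined statistic and the minimum P-value from a set of single-SNP P-values. It is not the authors' implementation. The function names and the example P-values are hypothetical, and both reference distributions (chi-square with 2k degrees of freedom for Fisher's statistic, the Sidak formula for the minimum P-value) assume the k tests are independent. SNPs in linkage disequilibrium violate that assumption, so in practice significance would usually be assessed by permutation instead.

```python
import math


def fisher_combined_p(pvalues):
    """Fisher's method: T = -2 * sum(log p_i).

    Under the null, and ASSUMING independent tests, T follows a
    chi-square distribution with 2k degrees of freedom.
    Each P-value must lie in (0, 1].
    """
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    half = stat / 2.0
    # Exact chi-square survival function for even df = 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total


def min_p_sidak(pvalues):
    """Minimum P-value with a Sidak adjustment for k tests.

    The adjustment 1 - (1 - p_min)^k is exact only for independent
    tests; it is conservative for positively correlated SNPs.
    """
    k = len(pvalues)
    p_min = min(pvalues)
    return p_min, 1.0 - (1.0 - p_min) ** k


# Hypothetical single-SNP P-values, made up for illustration only.
snp_pvalues = [0.04, 0.20, 0.03, 0.50, 0.10]

fisher_stat, fisher_p = fisher_combined_p(snp_pvalues)
p_min, sidak_p = min_p_sidak(snp_pvalues)
print(f"Fisher statistic = {fisher_stat:.3f}, combined P = {fisher_p:.4f}")
print(f"min P = {p_min:.3f}, Sidak-adjusted P = {sidak_p:.4f}")
```

The two statistics respond to different patterns: Fisher's statistic accumulates modest evidence across many SNPs, while the minimum P-value is driven by the single strongest signal.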

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Bayes Theorem
  • Computer Simulation
  • Fourier Analysis
  • Gene Frequency
  • Genotype
  • Humans
  • Linear Models
  • Models, Genetic*
  • Models, Statistical*
  • Multivariate Analysis
  • Polymorphism, Single Nucleotide*