Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data

Am J Hum Genet. 2008 Sep;83(3):311-21. doi: 10.1016/j.ajhg.2008.06.024. Epub 2008 Aug 7.


Although whole-genome association studies using tagSNPs are a powerful approach for detecting common variants, they are underpowered for detecting associations with rare variants. Recent studies have demonstrated that common diseases can be due to functional variants with a wide spectrum of allele frequencies, ranging from rare to common. An effective way to identify rare variants is through direct sequencing. The development of cost-effective sequencing technologies enables association studies to use sequence data from candidate genes and, in the future, from the entire genome. Although methods used for analysis of common variants are applicable to sequence data, their performance might not be optimal. In this study, it is shown that the collapsing method, which involves collapsing genotypes across variants and applying a univariate test, is powerful for analyzing rare variants, whereas multivariate analysis is robust against inclusion of noncausal variants. Both methods are superior to analyzing each variant individually with univariate tests. In order to unify the advantages of both collapsing and multiple-marker tests, we developed the Combined Multivariate and Collapsing (CMC) method and demonstrated that the CMC method is both powerful and robust. The CMC method can be applied to either candidate-gene or whole-genome sequence data.
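The collapsing step described in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: genotypes are coded as minor-allele counts (0, 1, 2), an individual is collapsed to a single carrier indicator (1 if any rare variant is present, otherwise 0), and the univariate test is a Pearson chi-square test on the resulting 2x2 carrier-by-status table. The function names and the toy data are hypothetical and do not come from the paper.

```python
from math import erfc, sqrt


def collapse(genotypes, rare_columns):
    """Collapse rare variants into one indicator per individual.

    Returns 1 if the individual carries at least one minor allele at any
    of the listed variant columns, otherwise 0 (assumed coding).
    """
    return [int(any(row[j] > 0 for j in rare_columns)) for row in genotypes]


def chi_square_2x2(carrier, is_case):
    """Pearson chi-square test (1 df, no continuity correction).

    Builds the carrier-by-status 2x2 table and returns (statistic, p_value).
    """
    pairs = list(zip(carrier, is_case))
    a = sum(1 for c, y in pairs if c and y)          # carrier cases
    b = sum(1 for c, y in pairs if c and not y)      # carrier controls
    c_ = sum(1 for c, y in pairs if not c and y)     # non-carrier cases
    d = sum(1 for c, y in pairs if not c and not y)  # non-carrier controls
    n = a + b + c_ + d
    denom = (a + b) * (c_ + d) * (a + c_) * (b + d)
    if denom == 0:
        # A margin is empty, so the table carries no information.
        return 0.0, 1.0
    stat = n * (a * d - b * c_) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical toy data: 4 cases followed by 4 controls, 3 rare variants.
genotypes = [
    [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0],  # cases
    [0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0],  # controls
]
is_case = [1, 1, 1, 1, 0, 0, 0, 0]

carrier = collapse(genotypes, rare_columns=[0, 1, 2])
stat, p_value = chi_square_2x2(carrier, is_case)
print(carrier, stat, p_value)
```

In this toy data no single variant appears in more than two individuals, yet the collapsed indicator pools three carrier cases against one carrier control, which is the intuition behind collapsing. A CMC-style analysis would, under the same assumptions, collapse variants within allele-frequency groups and pass the resulting indicators, together with any common variants, to a multivariate test; that step is not shown here.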

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Genetic Markers
  • Genetic Predisposition to Disease*
  • Genetic Variation*
  • Genome, Human*
  • Haplotypes
  • Humans
  • Linkage Disequilibrium
  • Models, Genetic*
  • Polymorphism, Single Nucleotide*

