Discovery of rare variants via sequencing: implications for the design of complex trait association studies

PLoS Genet. 2009 May;5(5):e1000481. doi: 10.1371/journal.pgen.1000481. Epub 2009 May 15.


There is strong evidence that rare variants are involved in complex disease etiology. The first step in implicating rare variants in disease etiology is their identification through sequencing in both randomly ascertained samples (e.g., the 1,000 Genomes Project) and samples ascertained according to disease status. We investigated to what extent rare variants will be observed across the genome and in candidate genes in randomly ascertained samples, the magnitude of variant enrichment in diseased individuals, and biases that can occur due to how variants are discovered. Sequencing cases can enrich for causal variants. However, when a gene or genes are not involved in disease etiology, limiting variant discovery to cases can lead to association studies with dramatically inflated false-positive rates.
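To give a feel for the question of how often a rare variant will be observed in a randomly ascertained sample, the following is a minimal sketch using the textbook binomial approximation. It is not taken from the paper's methods. It assumes that the 2n chromosomes of n diploid individuals are independent draws from a population in which the variant allele has frequency `maf`, and that sequencing detects every copy present. The function names `detection_probability` and `sample_size_for_detection` are hypothetical, chosen for illustration only.

```python
import math


def detection_probability(maf: float, n_individuals: int) -> float:
    """Probability that at least one copy of a variant allele is present
    among the 2 * n_individuals chromosomes of a random sample.

    Assumption (not from the paper): chromosomes are independent draws
    and sequencing has no false negatives.
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie in [0, 1]")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    # P(no copy on any chromosome) = (1 - maf) ** (2n); take the complement.
    return 1.0 - (1.0 - maf) ** (2 * n_individuals)


def sample_size_for_detection(maf: float, target: float) -> int:
    """Smallest number of diploid individuals for which the detection
    probability reaches `target`, under the same assumptions."""
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must lie strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    # Solve 1 - (1 - maf) ** c >= target for the chromosome count c.
    n_chromosomes = math.log(1.0 - target) / math.log(1.0 - maf)
    return math.ceil(n_chromosomes / 2)


if __name__ == "__main__":
    for maf in (0.01, 0.005, 0.001):
        n = sample_size_for_detection(maf, 0.95)
        p = detection_probability(maf, n)
        print(f"maf={maf}: n={n} individuals, detection probability={p:.3f}")
```

Under these assumptions, the required sample size grows roughly in inverse proportion to the allele frequency, which is why very rare variants call for large sequencing samples. The sketch says nothing about enrichment in cases or the discovery bias described above; those depend on the disease model.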

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Chromosome Mapping
  • False Positive Reactions
  • Gene Frequency
  • Genetic Linkage
  • Genetic Predisposition to Disease
  • Genetic Variation*
  • Genome, Human
  • Genome-Wide Association Study / methods*
  • Genome-Wide Association Study / statistics & numerical data
  • Haplotypes
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Polymorphism, Single Nucleotide
  • Probability
  • Sequence Analysis, DNA / methods*