Nature, 461 (7261), 272-6

Targeted Capture and Massively Parallel Sequencing of 12 Human Exomes



Sarah B Ng et al. Nature.


Genome-wide association studies suggest that common genetic variants explain only a modest fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability. Although DNA sequencing costs have fallen markedly, they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions ('exomes'), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of 12 humans. These include eight HapMap individuals representing three populations, and four unrelated individuals with a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS). We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for Mendelian disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of non-synonymous variants by predicted functional impact.


Figure 1. Minor allele frequency and coding indel length distributions
(a) The distribution of minor allele frequencies is shown for previously annotated versus novel cSNPs. (b) The distribution of minor allele frequencies is shown for synonymous versus nonsynonymous cSNPs. (c) The distribution of minor allele frequencies (by proportion, rather than count) is shown for synonymous cSNPs (n = 21,201) versus nonsynonymous cSNPs predicted to be benign (n = 13,295), possibly damaging (n = 3,368), or probably damaging (n = 2,227) by PolyPhen. (d) The distribution of lengths of coding insertion-deletion variants is shown (average numbers per exome). Error bars indicate s.d.
Figure 2. Direct identification of the causal gene for a monogenic disorder by exome sequencing
Boxes list the number of genes with at least one nonsynonymous cSNP, splice-site SNP, or coding indel (“NS/SS/I”) meeting the specified filters. Columns show the effect of requiring that at least one NS/SS/I variant be observed in each of 1 to 4 affected individuals. Rows show the effect of excluding from consideration variants found in dbSNP, in the 8 HapMap exomes, or in both. Column 5 models limited genetic heterogeneity or data incompleteness by relaxing the criteria so that variants need only be observed in any 3 of the 4 exomes for a gene to qualify.
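The filtering logic described in the Figure 2 legend can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `candidate_genes`, the gene labels, and the variant identifiers are all hypothetical, and the sketch assumes each exome has already been reduced to a mapping from gene to its set of NS/SS/I variants.

```python
def candidate_genes(exomes, known_variants=frozenset(), min_affected=None):
    """Return genes carrying at least one non-excluded variant in enough exomes.

    exomes         -- list of dicts mapping gene name -> set of variant IDs
                      (assumed to contain NS/SS/I variants only)
    known_variants -- variant IDs to exclude (e.g. seen in dbSNP or controls)
    min_affected   -- minimum number of exomes in which the gene must qualify;
                      defaults to all of them
    """
    if min_affected is None:
        min_affected = len(exomes)
    counts = {}
    for exome in exomes:
        for gene, variants in exome.items():
            # A gene qualifies in this exome if any variant survives exclusion.
            if variants - known_variants:
                counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, n in counts.items() if n >= min_affected}


# Hypothetical toy data: four affected individuals, three genes.
exomes = [
    {"GENE_A": {"a1"}, "GENE_B": {"b1"}, "GENE_C": {"c1"}},
    {"GENE_A": {"a2"}, "GENE_B": {"b1"}},
    {"GENE_A": {"a3"}, "GENE_B": {"b1"}, "GENE_C": {"c2"}},
    {"GENE_A": {"a1"}, "GENE_C": {"c3"}},
]
known = {"b1", "c1"}  # stand-in for variants found in dbSNP or control exomes

# Columns 1 to 4 of the figure: require a hit in each of the first k individuals.
for k in range(1, len(exomes) + 1):
    print(k, sorted(candidate_genes(exomes[:k], known)))

# Column 5 of the figure: a hit in any 3 of the 4 individuals suffices.
print("any 3 of 4:", sorted(candidate_genes(exomes, known, min_affected=3)))
```

In this toy example, excluding the known variants and adding affected individuals both shrink the candidate list, which is the effect the rows and columns of the figure are meant to show.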



