Haplotype-resolved genome sequencing of a Gujarati Indian individual

Nat Biotechnol. 2011 Jan;29(1):59-63. doi: 10.1038/nbt.1740. Epub 2010 Dec 19.

Abstract

Haplotype information is essential to the complete description and interpretation of genomes, genetic diversity and genetic ancestry. Although individual human genome sequencing is increasingly routine, nearly all such genomes are unresolved with respect to haplotype. Here we combine the throughput of massively parallel sequencing with the contiguity information provided by large-insert cloning to experimentally determine the haplotype-resolved genome of a South Asian individual. A single fosmid library was split into a modest number of pools, each providing ∼3% physical coverage of the diploid genome. Sequencing of each pool yielded reads overwhelmingly derived from only one homologous chromosome at any given location. These data were combined with whole-genome shotgun sequence to directly phase 94% of ascertained heterozygous single nucleotide polymorphisms (SNPs) into long haplotype blocks (N50 of 386 kilobases (kbp)). This method also facilitates the analysis of structural variation, for example, to anchor novel insertions to specific locations and haplotypes.
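
The abstract summarizes block contiguity with an N50 statistic. As a minimal sketch of how such a figure is conventionally computed, the Python below returns the block length at which blocks of that length or longer account for at least half of the total phased length. The function name `n50` and the toy block lengths are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def n50(block_lengths):
    """Return the N50 of a collection of block lengths.

    N50 is the largest length L such that blocks of length >= L
    together cover at least half of the summed length of all blocks.
    All lengths must be positive and share one unit (e.g. kbp).
    """
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one block length")
    if lengths[-1] <= 0:
        raise ValueError("block lengths must be positive")

    total = sum(lengths)
    running = 0
    # Walk from the longest block down until half the total is covered.
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical block lengths in kbp, for illustration only.
    toy_blocks_kbp = [100, 200, 300, 400]
    print(n50(toy_blocks_kbp))  # prints 300
```

Because the statistic weights blocks by their length, a small number of long blocks can dominate it, so an N50 is usually read alongside the fraction of variants phased (94% of ascertained heterozygous SNPs here).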

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Asian Continental Ancestry Group / genetics*
  • Base Sequence
  • Cell Line
  • Genome, Human / genetics*
  • Haplotypes / genetics*
  • Heterozygote
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Models, Molecular
  • Polymorphism, Single Nucleotide / genetics
  • Sequence Analysis, DNA / methods*