Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing

Nat Genet. 2014 Dec;46(12):1343-9. doi: 10.1038/ng.3119. Epub 2014 Oct 19.

Abstract

Haplotype-resolved genome sequencing enables the accurate interpretation of medically relevant genetic variation, deep inferences regarding population history and non-invasive prediction of fetal genomes. We describe an approach for genome-wide haplotyping based on contiguity-preserving transposition (CPT-seq) and combinatorial indexing. Tn5 transposition is used to modify DNA with adaptor and index sequences while preserving contiguity. After DNA dilution and compartmentalization, the transposase is removed, resolving the DNA into individually indexed libraries. The libraries in each compartment, enriched for neighboring genomic elements, are further indexed via PCR. Combinatorial 96-plex indexing at both the transposition and PCR stages enables the construction of phased synthetic reads from each of the nearly 10,000 'virtual compartments'. We demonstrate the feasibility of this method by assembling >95% of the heterozygous variants in a human genome into long, accurate haplotype blocks (N50 = 1.4-2.3 Mb). The rapid, scalable and cost-effective workflow could enable haplotype resolution to become routine in human genome sequencing.
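
The combinatorial indexing arithmetic and the N50 statistic mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the zero-based integer index encoding and the (read name, transposition index, PCR index) tuple layout are assumptions made for illustration only. The sketch shows how two 96-plex index sets give 96 x 96 = 9,216 index pairs (the "nearly 10,000" virtual compartments), how reads could be grouped by index pair, and how an N50 value is conventionally computed from a list of haplotype block lengths.

```python
from collections import defaultdict

# Taken from the abstract: 96-plex indexing at each of the two stages.
N_TRANSPOSITION_INDEXES = 96
N_PCR_INDEXES = 96


def virtual_compartment(tn5_index: int, pcr_index: int) -> int:
    """Map an index pair to a single compartment ID (hypothetical encoding).

    Indexes are assumed to be zero-based integers; real data would first
    need barcode sequences decoded into such integers.
    """
    if not 0 <= tn5_index < N_TRANSPOSITION_INDEXES:
        raise ValueError("transposition index out of range")
    if not 0 <= pcr_index < N_PCR_INDEXES:
        raise ValueError("PCR index out of range")
    return tn5_index * N_PCR_INDEXES + pcr_index


def group_reads(reads):
    """Group (read_name, tn5_index, pcr_index) tuples by virtual compartment."""
    groups = defaultdict(list)
    for read_name, tn5_index, pcr_index in reads:
        groups[virtual_compartment(tn5_index, pcr_index)].append(read_name)
    return dict(groups)


def n50(block_lengths):
    """Return the length L such that blocks of length >= L cover at least
    half of the total length (the usual N50 definition)."""
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("block_lengths must not be empty")
```

Reads that share both indexes land in the same virtual compartment, which is what allows them to be treated as likely neighbors on the same original DNA molecule when building phased synthetic reads.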

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Chromosome Mapping
  • Cluster Analysis
  • DNA / genetics
  • Female
  • Gene Library
  • Genome, Human
  • Genomics
  • Haplotypes*
  • Heterozygote
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Male
  • Polymerase Chain Reaction
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*
  • Transposases / genetics

Substances

  • DNA
  • Transposases

Associated data

  • BioProject/PRJNA241346