Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR

BMC Genomics. 2012 Dec 11:13:693. doi: 10.1186/1471-2164-13-693.

Abstract

Background: Genotyping and massively parallel sequencing projects generate vast amounts of diploid data that are only rarely resolved into their constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and that is most directly associated with biological function and the genetic causes of biological effects. Despite progress in genome-wide sequencing and in phasing algorithms and methods, problems remain in assembling regions of repetitive DNA and structural variation and in reconstructing linear haplotypes within them. These dynamic and structurally complex regions are often poorly understood at the sequence level. Regions that are highly similar in sequence tend to be collapsed in the genome assembly. This in turn means that downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required.

Results: To investigate the reconstruction of spatial information at structurally complex regions, we used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1 kb in length, allowing multiple variants from neighbouring loci to be phased, with allele-specific PCR and sequencing used to detect the phase. Using emulsion systems that link flanking regions to amplicons within the CNV, we reconstructed a 59 kb haplotype across the DEFA1A3 CNV in HapMap individuals.
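
The idea of building a longer haplotype from pairwise phase links can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: it assumes a simplified case in which each fusion product reports two (locus, allele) observations found on the same molecule, and in which every allele at a locus marks a distinct haplotype. The function name chain_phased_pairs and the locus labels (flank5, cnv1, flank3) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def chain_phased_pairs(pairs):
    """Group pairwise phased observations into haplotypes.

    pairs: iterable of ((locus_a, allele_a), (locus_b, allele_b)), each
    meaning the two alleles were observed on the same molecule.
    Returns a list of dicts mapping locus -> allele, one per haplotype.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Merge the two observations in each linked pair into one group.
    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    # Collect each connected group into a locus -> allele mapping.
    groups = defaultdict(dict)
    for locus, allele in list(parent):
        haplotype = groups[find((locus, allele))]
        if locus in haplotype:
            # Two alleles of one locus in the same group means the
            # links are inconsistent under the stated assumption.
            raise ValueError(f"conflicting alleles at locus {locus!r}")
        haplotype[locus] = allele

    # Sort for a deterministic order of the returned haplotypes.
    return sorted(groups.values(), key=lambda h: sorted(h.items()))


# Hypothetical links: a 5' flank to a CNV amplicon, then on to a 3' flank.
example_pairs = [
    (("flank5", "A"), ("cnv1", "G")),
    (("cnv1", "G"), ("flank3", "T")),
    (("flank5", "C"), ("cnv1", "T")),
    (("cnv1", "T"), ("flank3", "C")),
]
haplotypes = chain_phased_pairs(example_pairs)
```

Each pair acts as an edge between two allele observations, so the connected groups correspond to the extended haplotypes. In a copy number variable region the same allele may occur on more than one copy, and this simple grouping would then need additional information to keep the copies apart.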

Conclusion: This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Centromere / genetics
  • DNA Primers / genetics
  • Genome, Human / genetics*
  • Haplotypes / genetics*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Polymerase Chain Reaction / methods*
  • Telomere / genetics

Substances

  • DNA Primers