An In-Solution Hybridisation Method for the Isolation of Pathogen DNA from Human DNA-rich Clinical Samples for Analysis by NGS

Open Genomics J. 2012;5. doi: 10.2174/1875693X01205010018.

Abstract

Studies of DNA from pathogenic organisms in clinical samples are often complicated by the presence of large amounts of host (e.g., human) DNA. Isolating pathogen DNA from these samples would improve the efficiency of next-generation sequencing (NGS) and pathogen identification. Here we describe a solution-based hybridisation method for isolating pathogen DNA from a mixed population. This straightforward and inexpensive technique uses probes made from whole-genome DNA and off-the-shelf reagents. In this study, Escherichia coli DNA was successfully enriched from a mixture of E. coli and human DNA. After enrichment, genome coverage following NGS was significantly higher, and the evenness of coverage and GC content were unaffected. The technique was also applied to samples containing a mixture of human and Plasmodium falciparum DNA. The P. falciparum genome is particularly difficult to sequence because of its high AT content (80.6%) and repetitive nature. After enrichment, a bias in the recovered DNA was observed, with poorer representation of the AT-rich non-coding regions. This uneven coverage was also observed in pre-enrichment samples, but to a lesser degree. Despite the coverage bias in enriched samples, single-nucleotide polymorphism (SNP) calling in coding regions was unaffected, and the majority of samples had over 90% of their coding region covered at 5× depth. This technique shows significant promise as an effective method for enriching pathogen DNA from samples with heavy human contamination, particularly when applied to GC-neutral genomes.

Keywords: AT-rich DNA; E. coli; P. falciparum; clinical samples; enrichment; host DNA contamination; in-solution hybridisation; next-generation sequencing.
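
The abstract reports two quantities that are easy to misread: the AT content of a genome (80.6% for P. falciparum) and the fraction of a coding region covered at 5× depth. The following is a minimal sketch of how such figures could be computed, not the authors' pipeline. The function names `at_content` and `breadth_of_coverage` are hypothetical, and the input is assumed to be a plain sequence string and a list of per-base read depths (for example, parsed from a per-position depth table).

```python
from typing import Sequence


def at_content(seq: str) -> float:
    """Return the fraction of A/T bases among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / total


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 5) -> float:
    """Return the fraction of positions whose read depth is >= min_depth.

    `depths` holds one depth value per base of the region of interest,
    e.g. the concatenated coding regions of a genome.
    """
    if len(depths) == 0:
        raise ValueError("no positions supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not data from the study.
    print(f"AT content: {at_content('AATTGC'):.3f}")
    print(f"Breadth at 5x: {breadth_of_coverage([0, 5, 10, 4]):.2f}")
```

Under these definitions, a sample would meet the "over 90% of coding region covered at 5× depth" criterion when `breadth_of_coverage(coding_depths, min_depth=5)` exceeds 0.9, where `coding_depths` is a hypothetical list of depths restricted to coding positions.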