Ligation bias in Illumina next-generation DNA libraries: implications for sequencing ancient genomes

PLoS One. 2013 Oct 29;8(10):e78575. doi: 10.1371/journal.pone.0078575. eCollection 2013.

Abstract

Ancient DNA extracts consist of a mixture of endogenous molecules and contaminant DNA templates, often originating from environmental microbes. These two populations of templates exhibit different chemical characteristics, with the former showing depurination and cytosine deamination by-products resulting from post-mortem DNA damage. Such chemical modifications can interfere with the molecular tools used for building second-generation DNA libraries, and limit our ability to fully characterize the true complexity of ancient DNA extracts. In this study, we first use fresh DNA extracts to demonstrate that library preparation based on adapter ligation at AT-overhangs is biased against DNA templates starting with thymine residues, in contrast to blunt-end adapter ligation. We observe the same bias in fresh DNA extracts sheared using Bioruptor, Covaris and nebulizers. This contradicts previous reports suggesting that the bias could originate from the methods used for shearing DNA. It also suggests that AT-overhang adapter ligation efficiency is affected in a sequence-dependent manner, resulting in an uneven representation of different genomic contexts. We then show how this bias could affect the base composition of ancient DNA libraries prepared following AT-overhang ligation, mainly by limiting the ability to ligate DNA templates starting with thymines, and therefore with deaminated cytosines. This results in particular nucleotide misincorporation damage patterns that deviate from the signature generally expected for authenticating ancient sequence data. Consequently, we show that models adequate for estimating post-mortem DNA damage levels must be robust to the molecular tools used for building ancient DNA libraries.
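The kind of read-start bias described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the toy reads are hypothetical, and the background composition is taken from the reads themselves, whereas a real analysis would more likely compare against the reference genome. The sketch counts how often each base occurs at the first position of a read and divides that by the base's overall frequency. A ratio well below 1 for T would be consistent with a library that under-represents templates starting with thymine.

```python
from collections import Counter

BASES = "ACGT"


def first_base_frequencies(reads):
    """Fraction of reads whose first base is A, C, G or T (other symbols ignored)."""
    counts = Counter(read[0].upper() for read in reads if read)
    total = sum(counts[b] for b in BASES)
    if total == 0:
        return {b: 0.0 for b in BASES}
    return {b: counts[b] / total for b in BASES}


def overall_base_frequencies(reads):
    """Fraction of A, C, G and T across all positions of all reads."""
    counts = Counter()
    for read in reads:
        counts.update(read.upper())
    total = sum(counts[b] for b in BASES)
    if total == 0:
        return {b: 0.0 for b in BASES}
    return {b: counts[b] / total for b in BASES}


def start_enrichment(reads):
    """Ratio of first-position frequency to overall frequency for each base.

    Values below 1 mean the base is depleted at read starts relative to
    the (assumed) background; NaN is returned if the base never occurs.
    """
    first = first_base_frequencies(reads)
    overall = overall_base_frequencies(reads)
    return {
        b: (first[b] / overall[b]) if overall[b] > 0 else float("nan")
        for b in BASES
    }


# Hypothetical toy reads, chosen so that no read starts with T.
toy_reads = ["ACGT", "CCGT", "GGTT", "AATT"]
enrichment = start_enrichment(toy_reads)
```

In this toy input, T is common inside the reads but never appears at a read start, so its enrichment ratio is 0. Applying the same calculation to libraries built with blunt-end and AT-overhang ligation would be one simple way to contrast the two protocols.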

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artifacts
  • DNA Damage
  • DNA, Bacterial / genetics
  • Gene Library*
  • Genomics*
  • High-Throughput Nucleotide Sequencing / methods*

Substances

  • DNA, Bacterial

Grant support

MS was supported by the Lundbeck Foundation (R52-A5062). This work was supported by the Danish Council for Independent Research (FNU), the Danish National Research Foundation, and a Marie-Curie Career Integration Grant (FP7 CIG-293845). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.