Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic

PLoS Comput Biol. 2013;9(3):e1002935. doi: 10.1371/journal.pcbi.1002935. Epub 2013 Mar 14.

Abstract

DNA modifications such as methylation and DNA damage can play critical regulatory roles in biological systems. Single-molecule, real-time (SMRT) sequencing technology generates DNA sequences as well as DNA polymerase kinetic information that can be used for the direct detection of DNA modifications. We demonstrate that local sequence context in the neighborhood of the incorporation site has a strong impact on DNA polymerase kinetics during the DNA synthesis reaction. This makes it possible to estimate the expected kinetic rate of the enzyme at the incorporation site using kinetic rate information collected from existing SMRT sequencing data (historical data) covering the same local sequence contexts of interest. We develop an empirical Bayesian hierarchical model for incorporating historical data. Our results show that the model can greatly increase DNA modification detection accuracy and reduce the control data coverage required. For some DNA modifications that have a strong signal, a control sample is not needed at all, because historical data can be used in its place. Thus, sequencing costs can be greatly reduced by using the model. We implemented the model in an R package named seqPatch, which is available at https://github.com/zhixingfeng/seqPatch.
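
The idea of borrowing strength from historical data for one sequence context can be illustrated with a minimal normal-normal shrinkage sketch in Python. This is not the seqPatch model or its API. The function names, the normal assumption, the moment-based prior, and the known within-sample variance `sigma2` are all assumptions made for illustration only.

```python
from statistics import mean, variance


def fit_prior(historical_means):
    """Estimate a normal prior for one sequence context (hypothetical helper).

    historical_means: mean kinetic values for this context, one per
    historical dataset. Assumption: each mean is treated as precisely
    known, so the spread across datasets is used directly as the
    prior variance.
    """
    mu0 = mean(historical_means)
    tau2 = variance(historical_means)  # sample variance across datasets
    if tau2 <= 0:
        raise ValueError("historical means must vary to define a prior")
    return mu0, tau2


def posterior_mean(control_values, mu0, tau2, sigma2):
    """Combine the prior with control observations for the same context.

    sigma2 is the assumed-known variance of a single observation.
    Returns (posterior mean, posterior variance) of the expected
    kinetic value. With no control data the prior is returned
    unchanged, which mirrors using historical data in place of a control.
    """
    n = len(control_values)
    if n == 0:
        return mu0, tau2
    xbar = mean(control_values)
    precision = 1.0 / tau2 + n / sigma2
    post_mean = (mu0 / tau2 + n * xbar / sigma2) / precision
    return post_mean, 1.0 / precision
```

In this sketch, low control coverage leaves the estimate close to the historical mean, while high coverage pulls it toward the control mean. That is the general mechanism by which historical data can lower the control coverage needed.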

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Computational Biology / methods*
  • DNA Methylation
  • DNA, Bacterial / chemistry*
  • DNA, Bacterial / genetics
  • DNA, Bacterial / metabolism
  • DNA-Directed DNA Polymerase / metabolism
  • Escherichia coli / genetics
  • Kinetics
  • Models, Genetic
  • Nucleic Acid Conformation
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Bacterial
  • DNA-Directed DNA Polymerase

Grant support

This study was carried out at and funded by Pacific Biosciences. Since we are employees of Pacific Biosciences and Pacific Biosciences funded the work, the funders and those responsible for the study design, data collection, analysis, decision to publish, and preparation of the manuscript are one and the same.