Cytosine deamination is a major cause of baseline noise in next-generation sequencing

Mol Diagn Ther. 2014 Oct;18(5):587-93. doi: 10.1007/s40291-014-0115-2.


Background and objectives: As next-generation sequencing (NGS) becomes a major sequencing platform in clinical diagnostic laboratories, it is critical to identify artifacts that constitute baseline noise and may interfere with detection of low-level gene mutations. This is especially critical for applications requiring ultrasensitive detection, such as molecular relapse of solid tumors and early detection of cancer. We recently observed a ~10-fold higher frequency of C:G > T:A mutations than the background noise level in both wild-type peripheral blood and formalin-fixed paraffin-embedded samples. We hypothesized that these might represent cytosine deamination events, which have been seen using other platforms.
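A frequency comparison like the one above depends on grouping base substitutions by strand-collapsed class, so that C>T and its complement G>A are both counted as C:G > T:A. The following is a minimal Python sketch of such a tally. The function names and the input format (pairs of reference and alternate bases) are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter

# Complement map used to collapse every substitution onto a pyrimidine (C or T) reference.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The six strand-collapsed single-base substitution classes.
ALL_CLASSES = ["C:G>A:T", "C:G>G:C", "C:G>T:A", "T:A>A:T", "T:A>C:G", "T:A>G:C"]


def substitution_class(ref: str, alt: str) -> str:
    """Return the strand-collapsed class, e.g. 'C:G>T:A' for both C>T and G>A."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):  # purine reference: report from the complementary strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"


def class_counts(calls):
    """Count substitutions per strand-collapsed class from (ref, alt) pairs."""
    return Counter(substitution_class(ref, alt) for ref, alt in calls)


def fold_over_other_classes(counts, target="C:G>T:A"):
    """Ratio of the target class count to the mean count of the other five classes."""
    others = [counts.get(c, 0) for c in ALL_CLASSES if c != target]
    mean_other = sum(others) / len(others)
    if mean_other == 0:
        raise ValueError("no substitutions observed in the other classes")
    return counts.get(target, 0) / mean_other
```

This sketch compares raw counts. A real analysis would likely normalize each class by the number of sequenced C:G or T:A positions before comparing frequencies.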

Methods: To test this hypothesis, we pretreated samples with uracil N-glycosylase (UNG). Additionally, to test whether some of the cytosine deamination might be a laboratory artifact, we simulated the heat associated with polymerase chain reaction thermocycling by subjecting samples to thermocycling in the absence of polymerase. To test the safety of universal UNG pretreatment, we tested known positive samples treated with UNG.

Results: UNG pretreatment significantly reduced the frequencies of these mutations, consistent with a biologic source of cytosine deamination. The simulated thermocycling-heated samples demonstrated significantly increased frequencies of C:G > T:A mutations without other baseline base substitutions being affected. Samples with known mutations demonstrated no decrease in our ability to detect these after treatment with UNG.
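The abstract does not state which statistical test was used. As one possible way to compare a C:G > T:A error rate between two conditions (for example, untreated versus UNG-pretreated), the sketch below applies a pooled two-proportion z-test. The choice of test, the function name, and the inputs (error counts and total sequenced bases per condition) are assumptions made for illustration only.

```python
import math


def two_proportion_z_test(errors_a, bases_a, errors_b, bases_b):
    """Two-sided pooled z-test for a difference between two error rates.

    errors_*: number of C:G>T:A substitutions observed in each condition
    bases_*:  number of sequenced bases at risk in each condition
    Returns (rate_a, rate_b, z, p_value).
    """
    if bases_a <= 0 or bases_b <= 0:
        raise ValueError("base counts must be positive")
    rate_a = errors_a / bases_a
    rate_b = errors_b / bases_b
    # Pooled rate under the null hypothesis that both conditions share one rate.
    pooled = (errors_a + errors_b) / (bases_a + bases_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / bases_a + 1 / bases_b))
    if se == 0:
        raise ValueError("standard error is zero; rates cannot be compared")
    z = (rate_a - rate_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value
```

The normal approximation is only reasonable when error counts are not very small. For sparse counts, an exact test would be a safer choice.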

Conclusion: Baseline noise during NGS is mostly due to cytosine deamination, the source of which is likely to be both biologic and an artifact of thermocycling, and it can be reduced by UNG pretreatment.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Artifacts*
  • Cytosine / metabolism
  • DNA / analysis
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Mutation
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Temperature
  • Uracil-DNA Glycosidase / metabolism*

Substances

  • Cytosine
  • DNA
  • Uracil-DNA Glycosidase