Predicting double-strand DNA breaks using epigenome marks or DNA at kilobase resolution

Genome Biol. 2018 Mar 15;19(1):34. doi: 10.1186/s13059-018-1411-7.


Double-strand breaks (DSBs) arise when both DNA strands are damaged by agents such as radiation and chemicals. DSBs can cause the abnormal chromosomal rearrangements associated with cancer. Recent techniques allow genome-wide mapping of DSBs at high resolution, enabling comprehensive study of their origins. However, these techniques are costly and technically challenging. We therefore devise a computational approach that predicts DSBs from the epigenomic and chromatin context, for which public data are readily available from the ENCODE project. The approach achieves high prediction accuracy at kilobase resolution. Chromatin accessibility, activity, and long-range contacts are the best predictors.
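
The abstract does not say which model or input tracks are used, so the following is only a minimal sketch of the general idea: treat each kilobase bin as a sample, use per-bin chromatin signals as features, and train a classifier to separate DSB bins from non-DSB bins. Everything here is an assumption made for illustration: the data are synthetic, the feature names and effect sizes are hypothetical, and plain logistic regression stands in for whatever model the authors actually used.

```python
import math
import random

# Hypothetical per-bin features. The abstract names accessibility, activity and
# long-range contacts as the best predictors, but not these exact tracks.
FEATURES = ["accessibility", "activity_mark", "long_range_contacts"]


def make_synthetic_bins(n_bins, seed=0):
    """Simulate feature values and DSB labels for n_bins bins (not real data)."""
    rng = random.Random(seed)
    true_weights = [2.0, 1.5, 1.0]  # assumed effect sizes, illustration only
    X, y = [], []
    for _ in range(n_bins):
        x = [rng.gauss(0.0, 1.0) for _ in FEATURES]
        logit = sum(w * v for w, v in zip(true_weights, x)) - 1.0
        p_break = 1.0 / (1.0 + math.exp(-logit))
        X.append(x)
        y.append(1 if rng.random() < p_break else 0)
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Predicted probability that a bin contains a DSB."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, y):
            err = predict_proba(w, b, x) - label
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def auroc(scores, labels):
    """Chance that a random DSB bin outscores a random non-DSB bin (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Train on the first 600 synthetic bins and evaluate on the remaining 300.
X, y = make_synthetic_bins(900, seed=0)
X_train, y_train, X_test, y_test = X[:600], y[:600], X[600:], y[600:]
w, b = fit_logistic(X_train, y_train)
test_scores = [predict_proba(w, b, x) for x in X_test]
test_auc = auroc(test_scores, y_test)

for name, weight in zip(FEATURES, w):
    print(f"{name}: weight {weight:.2f}")
print(f"held-out AUROC: {test_auc:.3f}")
```

In a real analysis the synthetic generator would be replaced by signals aggregated per bin from public tracks, and the fitted weights (or another importance measure) would indicate which features contribute most to the prediction.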

Keywords: Chromatin; Double-strand breaks; Epigenetics; Machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line
  • Chromatin / metabolism
  • DNA / chemistry*
  • DNA Breaks, Double-Stranded*
  • Epigenesis, Genetic*
  • Histone Code
  • Humans
  • Nucleotide Motifs

Substances

  • Chromatin
  • DNA