The sequence preference of DNA methylation variation in mammalians

PLoS One. 2017 Oct 18;12(10):e0186559. doi: 10.1371/journal.pone.0186559. eCollection 2017.

Abstract

Methylation of cytosine at the 5 position of the pyrimidine ring is one of the most prevalent and significant epigenetic modifications in mammalian DNA. The CpG methylation level shows a bimodal distribution, but the bimodality can be overestimated because of heterogeneity in per-base sequencing depth. Here, we developed an algorithm to eliminate the effect of per-base depth inhomogeneity on the bimodality and obtained a random CpG methylation distribution. By quantifying the deviation between the observed methylation distribution and the random one using the information formula, we found that among the tetranucleotides 5'-N5CGN3-3' (N5, N3 = A, C, G or T), GCGN3 and CCGN3 show less apparent deviation than ACGN3 and TCGN3, indicating that GCGN3 and CCGN3 are less variable in their level of methylation. The methylation variation of N5CGN3 is conserved among different cells, tissues and species, implying common features in the mechanisms of methylation and demethylation, presumably mediated by DNMTs and TETs, respectively, in mammals. The sequence dependence of DNA methylation variation also relates to gene regulation and prompts a reexamination of the role of DNA sequence in fundamental biological processes.
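
The comparison between an observed and a depth-matched random methylation distribution can be illustrated with a small sketch. The abstract does not specify the null model or the exact information formula, so everything below is an assumption: the random distribution is taken to be a binomial model in which every CpG shares one global methylation rate but keeps its own read depth, and the deviation is measured as a Kullback-Leibler divergence. All function names and the toy data are hypothetical.

```python
from math import comb, log2

def bin_index(k, d, n_bins):
    # Bin for methylation level k/d; a level of exactly 1.0 goes in the last bin.
    return min(k * n_bins // d, n_bins - 1)

def observed_hist(sites, n_bins=10):
    # sites: list of (depth, methylated_reads) pairs, one per CpG.
    hist = [0.0] * n_bins
    for d, m in sites:
        hist[bin_index(m, d, n_bins)] += 1.0
    return [h / len(sites) for h in hist]

def random_hist(sites, n_bins=10):
    # Assumed null model: each CpG draws Binomial(depth, p) methylated reads,
    # with p the pooled methylation rate. The exact binomial probabilities are
    # summed, so the histogram reflects depth heterogeneity without sampling noise.
    p = sum(m for _, m in sites) / sum(d for d, _ in sites)
    hist = [0.0] * n_bins
    for d, _ in sites:
        for k in range(d + 1):
            prob = comb(d, k) * p**k * (1 - p) ** (d - k)
            hist[bin_index(k, d, n_bins)] += prob
    return [h / len(sites) for h in hist]

def kl_divergence(obs, ref):
    # Assumed deviation measure: D(obs || ref) in bits, skipping empty observed bins.
    return sum(o * log2(o / r) for o, r in zip(obs, ref) if o > 0)

# Toy, strongly bimodal data (made up for illustration only).
sites = [(10, 0), (10, 10), (20, 1), (20, 19), (5, 0), (5, 5)]
obs = observed_hist(sites)
rand = random_hist(sites)
deviation = kl_divergence(obs, rand)
```

Under these assumptions, running the same calculation separately on the CpGs of each N5CGN3 context would give one deviation value per tetranucleotide, which is the kind of quantity the abstract compares across GCGN3, CCGN3, ACGN3 and TCGN3.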

MeSH terms

  • Animals
  • Base Sequence
  • Brain / metabolism
  • Chromosomes, Human, Pair 1 / genetics
  • Conserved Sequence / genetics
  • CpG Islands / genetics
  • DNA Methylation / genetics*
  • Humans
  • Mammals / genetics*
  • Mice
  • Nucleotides / genetics
  • Organ Specificity / genetics
  • Sequence Analysis, DNA
  • Species Specificity

Substances

  • Nucleotides

Grants and funding

This work was supported by the National Natural Science Foundation of China under grants 21573006 and 21233002. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.