Epigenetic Conservation at Gene Regulatory Elements Revealed by Non-Methylated DNA Profiling in Seven Vertebrates
Two-thirds of gene promoters in mammals are associated with regions of non-methylated DNA, called CpG islands (CGIs), which counteract the repressive effects of DNA methylation on chromatin. In cold-blooded vertebrates, computational CGI predictions often reside away from gene promoters, suggesting a major divergence in gene promoter architecture across vertebrates. By experimentally identifying non-methylated DNA in the genomes of seven diverse vertebrates, we instead reveal that non-methylated islands (NMIs) of DNA are a central feature of vertebrate gene promoters. Furthermore, NMIs are present at orthologous genes across vast evolutionary distances, revealing a surprising level of conservation in this epigenetic feature. By profiling NMIs in different tissues and developmental stages we uncover a unifying set of features that are central to the function of NMIs in vertebrates. Together these findings demonstrate an ancient logic for NMI usage at gene promoters and reveal an unprecedented level of epigenetic conservation across vertebrate evolution. DOI:http://dx.doi.org/10.7554/eLife.00348.001.
Chicken; Chromatin; CpG islands; DNA methylation; Epigenetics; Evolutionary conservation; Human; Mouse; Xenopus; Zebrafish.
Conflict of interest statement
CPP: Senior Editor,
The other authors declare that no competing interests exist.
Figure 1. CpG island predictions do not accurately identify non-methylated islands of DNA in vertebrate genomes.
(A) Non-methylated DNA profiles in testes at a representative syntenic region for seven vertebrate species. Genes are shown in black (improved annotation of gene TSSs using RNA-seq data is shown in red), CpG island predictions in green (CGI), and non-methylated DNA profiles are shown in blue. A phylogenetic tree (left) highlights the evolutionary relationship among the seven species. Dashed grey lines highlight the relationship between the gene TSSs across the species. A gap in the zebrafish profile indicates that aptx is found at a separate locus from dnaja1 and smu1. (B) The genome-wide overlap between CpG islands (green) and non-methylated islands (blue) is depicted as a Venn diagram for each of the species. (C) Nucleotide properties of non-methylated islands and control regions are depicted as density plots. CpG observed/expected (left) and GC content (right) are shown for NMI and control regions of the genome. Median values are shown as dark vertical lines. Thresholds for CpG island prediction are indicated (black dashed line).
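The two nucleotide properties plotted in panel (C) can be computed directly from sequence. The sketch below is illustrative only: it assumes the widely used Gardiner-Garden and Frommer definition of the CpG observed/expected ratio (CpG count × length / (C count × G count)), and the function names are hypothetical. The thresholds actually drawn in the figure are not stated in this legend, so none are hard-coded here.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """CpG observed/expected ratio under the Gardiner-Garden and Frommer
    definition (an assumption; the legend does not give the formula).

    observed = number of CG dinucleotides
    expected = (number of C * number of G) / sequence length
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n_c == 0 or n_g == 0:
        return 0.0
    return n_cpg * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    region = "ATATCGAT"  # toy sequence, not real data
    print(gc_content(region), cpg_obs_exp(region))
```

Applied to every NMI and to matched control regions, these two values would give the distributions that the density plots summarise.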
Figure 1—figure supplement 1. NMIs are a conserved feature of vertebrate promoters as illustrated by two syntenic loci.
(A) and (B) Profiles of non-methylated DNA are shown in testes at two representative syntenic regions for seven vertebrate species. Genes are shown in black (improved annotation of gene TSSs using RNA-seq data is shown in red), CpG island predictions in green, and non-methylated DNA profiles are shown in blue. A phylogenetic tree (left) highlights the evolutionary relationship among the seven species and dashed grey lines highlight the relationship between the gene TSSs across the species.
Figure 2. Non-methylated islands are associated with gene promoters in vertebrate genomes.
(A) A histogram depicting the proportion of protein-coding transcription start sites (TSSs) which are overlapped by an NMI for all seven species. Blue bars indicate overlap with annotated TSSs and red bars indicate overlap with additional TSSs identified using RNA-seq data (platypus, chicken and lizard) or Xtev gene sets (frog). (B) Profiles of non-methylated DNA were plotted over a 6-kb window centred on all TSSs with an NMI (dark blue), without an NMI (blue), and for all transcription termination sites (TTS, black). The non-methylated DNA signal peaks at the TSS of gene promoters in all vertebrates.
Figure 3. Non-methylated islands are a highly conserved epigenetic feature of vertebrate gene promoters.
(A) The presence of NMIs at orthologous gene TSSs is preserved as illustrated by a pairwise analysis of NMIs at vertebrate gene orthologues. The percentage of NMIs conserved at orthologous gene TSSs was calculated in a pairwise manner and found to be highly statistically significant for all comparisons across the seven vertebrate species (p < 10^-10, hypergeometric test). (B) A proportional Venn diagram illustrating the three-way comparison of NMI presence at conserved human-mouse-zebrafish gene orthologue TSSs.
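As a rough illustration of the hypergeometric test named in panel (A), the sketch below computes the upper-tail probability of seeing at least k orthologue pairs with an NMI in both species, given how many orthologues carry an NMI in each species separately. The function name and the counts in the usage line are hypothetical; the legend does not give the underlying gene numbers or the exact test set-up.

```python
from math import comb


def hypergeom_upper_tail(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: number of orthologous gene pairs considered
    K: pairs whose TSS has an NMI in species 1
    n: pairs whose TSS has an NMI in species 2
    k: pairs whose TSS has an NMI in both species
    """
    lo = max(k, 0)
    hi = min(K, n)
    # Exact integer numerator; a single division at the end limits rounding error.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return numerator / comb(N, n)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(hypergeom_upper_tail(N=10000, K=6000, n=5500, k=4500))
```

A very small value indicates that the two species share NMI-marked orthologues far more often than expected if NMIs were placed independently.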
Figure 4. Intergenic NMIs are associated with distal regulatory elements, non-coding RNAs, and unannotated transcripts.
(A) Most NMIs are associated with known protein-coding genes (left) but a substantial proportion are located within intergenic regions of the genome (right). (B) NMIs (green) are found at 45% and 64% of all known long non-coding RNA (lncRNA) TSSs (black) in mouse and zebrafish respectively. (C) A pie chart depicting the proportion of intergenic NMIs (>5 kb from a protein-coding gene) associated with different genomic features in mouse embryonic stem (ES) cells and zebrafish 24 hpf embryos. The association was performed hierarchically in the following order: lncRNA TSSs, other non-coding RNA TSSs (miRNAs, rRNAs, snRNAs, or snoRNAs), other TSSs (pseudogenes and processed transcripts), putative enhancer mark H3K4me1 and novel RNA-seq TSSs. This analysis indicates that intergenic NMIs mark novel transcriptional units or regulatory elements.
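The hierarchical association described in panel (C) can be sketched as a first-match classification: each intergenic NMI is given the first category, in the stated order, that has an overlapping feature. This is a minimal sketch under simplifying assumptions (a single chromosome, half-open integer intervals, a linear scan); the category labels, function names and coordinates are illustrative and are not taken from the authors' pipeline.

```python
# Category order follows the legend; the label strings themselves are illustrative.
PRIORITY = [
    "lncRNA TSS",
    "other ncRNA TSS",
    "other TSS",
    "H3K4me1",
    "novel RNA-seq TSS",
]


def overlaps(a: tuple, b: tuple) -> bool:
    """True if half-open intervals a = (start, end) and b = (start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def classify_nmi(nmi: tuple, features: dict) -> str:
    """Return the first category in PRIORITY with a feature overlapping the NMI."""
    for category in PRIORITY:
        if any(overlaps(nmi, feature) for feature in features.get(category, [])):
            return category
    return "unassigned"


if __name__ == "__main__":
    # Toy coordinates, not real annotations.
    features = {"lncRNA TSS": [(100, 101)], "H3K4me1": [(50, 500)]}
    for nmi in [(90, 200), (300, 400), (600, 700)]:
        print(nmi, classify_nmi(nmi, features))
```

Because the first match wins, an NMI overlapping both a lncRNA TSS and an H3K4me1 peak is counted only once, under the lncRNA category, which is what lets the categories sum to a whole pie.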
Figure 5. Differential methylation of a subset of NMIs.
(A) All vertebrate genomes have a subset of NMIs that are subject to differential methylation as illustrated by a heat map of non-methylated DNA signal from testes and liver in human, mouse and zebrafish. In each case NMIs are ranked according to length and clustered as shared (upper) or unique (lower) between the two tissues. A 5-kb window centred at the NMI is shown and read density is indicated by colour intensity. (B) The overlap of NMIs identified in liver and testes is depicted by Venn diagrams for NMIs associated with protein-coding TSSs (upper) and for NMIs away from TSSs (lower). NMIs at TSSs are generally non-methylated in both tissues whereas differentially methylated NMIs tend to be found away from TSSs. (C) NMI length distribution plots for shared (Shared NMIs, solid line) or unique (Unique NMIs, dashed line) NMIs from testes (blue) or liver (red). Shared NMIs tend to be longer than tissue-specific unique NMIs. (D) CpG density distribution plots for shared (solid line) or unique (dashed line) NMIs from testes (blue) or liver (red). Shared NMIs tend to have higher CpG density than unique NMIs.
Figure 5—figure supplement 1. Validation of differentially methylated NMIs between liver and testes in mouse and zebrafish by bisulfite sequencing.
(A, i–iv) Mouse NMIs unique to liver or testes were analysed by bisulfite sequencing to verify that the regions were indeed differentially methylated. Traces of non-methylated DNA are depicted for differentially methylated regions in mouse liver (red) and testes (blue) with NMIs depicted as bars under the traces. The y-axis depicts read density. Methylation status of the unique NMIs was confirmed using the indicated bisulfite PCR amplicon (BA, black rectangle). Empty and filled circles represent non-methylated and methylated CpG dinucleotides, respectively. (B, i–iii) Zebrafish NMIs unique to liver or testes were validated by bisulfite sequencing as in (A).
Figure 5—figure supplement 2. Differential methylation of NMIs in platypus, chicken, lizard and frog and length distributions of NMIs from all seven vertebrates.
(A) A heat map of non-methylated DNA signal from testes and liver in platypus, chicken, lizard and frog. In each case NMIs are ranked according to length and clustered as shared (upper) or unique (lower) between the two tissues. A 5-kb window centred at the NMI is shown and read density is indicated by colour intensity. (B) Venn diagrams demonstrate that shared NMIs are found predominantly at protein-coding gene TSSs (upper) and unique NMIs tend to be found away from TSSs (lower). (C) NMI length distribution plots for shared (Shared NMIs, solid line) or unique (Unique NMIs, dashed line) NMIs from testes (blue) or liver (red). Shared NMIs tend to be longer than tissue-specific unique NMIs. (D) CpG density distribution plots for shared (solid line) or unique (dashed line) NMIs from testes (blue) or liver (red). Shared NMIs tend to have higher CpG density than unique NMIs.
Figure 5—figure supplement 3. Genes with TSS-associated testes- or liver-specific NMIs are over-represented for increased differential expression in the same tissue.
MA plots depicting expression differences for genes with TSS-associated NMIs from liver and testes for human, mouse, platypus and chicken. Genes are coloured according to whether they share an NMI in both liver and testes (grey) or have an NMI only in liver (red) or testes (blue). Genes are further distinguished as being differentially expressed or overexpressed in a tissue-specific manner (dark, filled circles) or not (light, open circles). The log mean expression of the gene from both liver and testes is displayed on the x axis (A) and the log ratio of gene expression is displayed on the y axis (M). The dotted lines indicate a fold change threshold of two. Genes with tissue-specific NMIs were significantly over-represented in the set of genes which had increased differential expression in seven out of eight cases (Fisher's exact test, human testes p < 10^-21, liver p < 10^-27; mouse testes p < 10^-18, liver p < 10^-8; platypus testes p < 10^-2, liver p < 10^-17; chicken liver p < 10^-6).
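For readers unfamiliar with MA plots, the sketch below shows how the two coordinates are conventionally derived from a pair of expression values and how a two-fold threshold maps onto the M axis. It is a sketch under assumptions: base-2 logarithms, a pseudocount of 1 to avoid log(0), A taken as the mean of the two log values, and a threshold treated as inclusive. The authors' exact normalisation and definition of A may differ, and the function names are hypothetical.

```python
from math import log2


def ma_values(testes: float, liver: float, pseudocount: float = 1.0) -> tuple:
    """Return (M, A) for one gene.

    M: log2 ratio of testes to liver expression (y axis)
    A: mean of the two log2 expression values (x axis)
    The pseudocount is an assumption added to keep the logarithm defined at zero.
    """
    log_t = log2(testes + pseudocount)
    log_l = log2(liver + pseudocount)
    return log_t - log_l, 0.5 * (log_t + log_l)


def exceeds_fold_change(m: float, fold: float = 2.0) -> bool:
    """True if |M| reaches the given fold change (2-fold corresponds to |M| >= 1)."""
    return abs(m) >= log2(fold)


if __name__ == "__main__":
    m, a = ma_values(testes=7.0, liver=1.0)  # toy values, not real data
    print(m, a, exceeds_fold_change(m))
```

On this scale the dotted lines at a fold change of two sit at M = +1 and M = -1, so genes above the upper line are more highly expressed in testes and genes below the lower line are more highly expressed in liver.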
Figure 6. Chromatin modification at NMIs depends on their underlying DNA methylation state.
(A) H3K4me3 read density from testes (blue) and liver (red) is profiled over testes unique (left) and liver unique (right) NMIs for human (upper) and mouse (lower) and displayed as an average profile. At differentially methylated loci, the histone H3K4me3 modification is found preferentially in the tissue with the non-methylated NMI. (B) The H3K4me3 signal (profiled in frog stage 11–12 embryos and zebrafish 24 hpf) is present specifically at unique NMIs from frog stage 11–12 and zebrafish 24 hpf (green) and not at unique NMIs from the liver (red).
Figure 7. A unique class of broad non-methylated islands encompasses polycomb-regulated developmental genes.
(A) An example of a broad region of non-methylated DNA associated with the sp9 gene for four representative species (human, mouse, frog and fish). Dashed grey lines highlight the location of the gene TSSs across the four species. (B) Non-methylated DNA profiles are depicted for genes associated with broad NMIs (dark blue) and canonical NMIs (light blue) in mouse embryonic stem (ES) cells and frog stage 11–12. The profile is scaled to show an averaged gene with one gene length depicted upstream and downstream. (C) H3K4me3 ChIP-seq signal from mouse and frog was plotted as in (B). H3K4me3 profiles reflect the underlying non-methylated DNA profiles. (D) Genes associated with broad NMIs were analysed by gene ontology (GO) analysis for mouse ES cell and frog stage 11–12. Broad NMIs are found to be significantly enriched for GO term categories associated with sequence-specific DNA binding, transcriptional regulation and development. MF: molecular function; BP: biological process. p < 10^-5 for all GO terms. (E) H3K27me3 ChIP-seq signal from mouse and frog was plotted for the same gene sets as in (B). The profile is scaled to show an averaged gene with three gene lengths depicted upstream and downstream. As for H3K4me3, H3K27me3 ChIP-seq profiles correspond to the underlying non-methylated DNA profile. (F) A representative example of two broadly non-methylated genes gsx1 and nkx2.2 for mouse and frog. In both species, the broad non-methylated regions (green) are associated with the polycomb repressive mark H3K27me3 (red). In addition, in mouse, polycomb repressive complex 2 (ezh2, yellow and suz12, orange) and polycomb repressive complex 1 (ring1b, purple) components are associated with the broad non-methylated regions. The y-axis depicts read density. Genes are depicted above the profiles in black.
Figure 7—figure supplement 1. Hox gene clusters are characterized by broad NMIs.
The hoxa gene cluster from all seven vertebrate species is associated with broad regions of non-methylated DNA. Genes are shown in black, non-methylated DNA profiles are shown in blue, and dashed grey lines highlight the relationship between conserved gene TSSs across the species.