Cell Stem Cell. 2011 Jun 3;8(6):676-87.
doi: 10.1016/j.stem.2011.04.004.

DNA Methylation and SETDB1/H3K9me3 Regulate Predominantly Distinct Sets of Genes, Retroelements, and Chimeric Transcripts in mESCs

Free PMC article


Mohammad M Karimi et al. Cell Stem Cell.


DNA methylation and histone H3 lysine 9 trimethylation (H3K9me3) play important roles in silencing of genes and retroelements. However, a comprehensive comparison of genes and repetitive elements repressed by these pathways has not been reported. Here we show that in mouse embryonic stem cells (mESCs), the genes upregulated after deletion of the H3K9 methyltransferase Setdb1 are distinct from those derepressed in mESCs deficient in the DNA methyltransferases Dnmt1, Dnmt3a, and Dnmt3b, with the exception of a small number of primarily germline-specific genes. Numerous endogenous retroviruses (ERVs) lose H3K9me3 and are concomitantly derepressed exclusively in SETDB1 knockout mESCs. Strikingly, ~15% of upregulated genes are induced in association with derepression of promoter-proximal ERVs, half in the context of "chimeric" transcripts that initiate within these retroelements and splice to genic exons. Thus, SETDB1 plays a previously unappreciated yet critical role in inhibiting aberrant gene transcription by suppressing the expression of proximal ERVs.


Figure 1. SETDB1 and DNA methylation are required for silencing of predominantly distinct sets of genes
RNA-seq was performed on SETDB1 KO and DNMT TKO mESCs and their parent lines TT2 and J1, respectively. A–D. UCSC genome browser (mm9) screen shots show mRNA levels across the MageA and Rhox gene clusters, as well as the germline-specific gene Dazl and the Mmp12 gene. E. Two-dimensional plots of all protein-coding Ensembl genes (22,848 total) with non-zero read coverage in either WT or KO lines are shown. Up- and down-regulated genes showing Z-score ≥ 1.2 and fold-change ≥ 2.0 are highlighted. F. The overlap in up-regulated genes is shown, along with G. the fraction of genes up-regulated in the SETDB1 KO line that are marked in the TT2 line by DNA methylation (Myant et al., 2011) and/or H3K9me3 in the promoter region (TSS +/−500bp). H–I. Similar analyses are shown for the down-regulated genes (see Figures S1 and S2).
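The thresholds quoted for panel E can be read as a simple per-gene filter. The following is a minimal sketch, assuming each gene already has a KO/WT fold-change and a signed Z-score computed upstream; the legend does not say how the Z-score is derived, so it is treated as an input. The function name, argument names, and the symmetric treatment of down-regulated genes are assumptions made for illustration.

```python
def classify_gene(fold_change, z_score, fc_cutoff=2.0, z_cutoff=1.2):
    """Label a gene 'up', 'down', or 'unchanged'.

    fold_change: KO / WT expression ratio (assumed > 0).
    z_score: signed Z-score of the KO vs. WT difference (assumed input).
    Cutoffs default to the values quoted in the legend.
    """
    if z_score >= z_cutoff and fold_change >= fc_cutoff:
        return "up"
    # Assumption: down-regulation mirrors the up-regulation criteria.
    if z_score <= -z_cutoff and fold_change <= 1.0 / fc_cutoff:
        return "down"
    return "unchanged"
```

A gene must pass both cutoffs to be highlighted, so a large fold-change with a weak Z-score is left as unchanged.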
Figure 2. SETDB1 bound loci are depleted of H3K9me3 in SETDB1 KO but not DNMT TKO cells
A. H3K9me3 RPKM values at genomic (light shading) or promoter (heavy shading) regions bound by SETDB1 are plotted for DNMT TKO vs. J1 and SETDB1 KO vs. TT2 lines, and the number of genomic sites or promoter regions (in parentheses) losing or gaining H3K9me3 in the KO lines is shown. B. The number and percentage of SETDB1 bound, H3K9me3 marked promoter regions losing, gaining or showing no change in H3K9me3 in DNMT TKO and SETDB1 KO lines are shown. C. The percent and number of all genes or genes bound by SETDB1 in their promoter regions that are up-regulated are shown for each KO line. D. The percent and number of genes with SETDB1-bound promoters that lose H3K9me3 and are up-regulated in each KO line are shown (see Figures S1 and S2).
Figure 3. Genes depleted of promoter H3K9me3 in the SETDB1 KO are generally not marked by DNA methylation or H3K27me3
The DNA methylation (Myant et al., 2011) and H3K27me3 (Mikkelsen et al., 2007) states of genes depleted of H3K9me3 in their promoter regions (TSS +/−500bp) in the SETDB1 KO line A. showing no increase or B. increased expression, are shown. C. The tissue specificity of genes represented in the BioGPS database that are de-repressed in both the SETDB1 KO and DNMT TKO lines (30 of 39 total) is shown, along with the DNA methylation (Myant et al., 2011), H3K9me3 and SETDB1 binding (Yuan et al., 2009) states in the promoter regions of these genes (see Figure S2, Tables S1 and S2). Genes highlighted in yellow are expressed in the germline. NA, promoters of MGI gene not represented in the DNA methylation dataset.
Figure 4. ERVs are de-repressed in SETDB1 KO but not DNMT TKO mESCs
A. The sum of RNA-seq reads aligned to each annotated ERV subfamily was normalized to the total number of exonic reads and plotted for SETDB1 KO vs. TT2 and DNMT TKO vs. J1 lines. ERV subfamilies up- or down-regulated in the KO lines are shown in red and blue, respectively. Subfamilies up-regulated in both lines are highlighted in green. B. For analysis of intact ERVs, the total normalized RNA-seq coverage for all annotated ERV internal regions flanked by their cognate LTRs was determined for representative class I, II and III ERV subfamilies, as well as LINE1MdA elements. The fold-change in expression for each pair of cell lines is shown. C. A screen shot of a representative ETnERV2/MusD element, including H3K9me3 NChIP-seq, RNA-seq and SETDB1 ChIP-seq (Yuan et al., 2009) tracks, is shown. D. The fold-change in H3K9me3 (including 1 kb of flanking genomic sequence) relative to the parent line for each subfamily presented in panel B is shown. E. The overlap between all annotated ERVs (+/−100 bp of flanking sequence) and mapped SETDB1 binding sites (threshold height >8) reveals that ~40% of all SETDB1 binding sites map within or near an annotated ERV (see Figures S3, S4 and Table S3). Random expectation of ~25% is based on 20 bootstraps (p-value <0.05).
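The overlap measure in panel E can be illustrated as the fraction of binding sites that fall within an ERV extended by a fixed flank. This is a minimal sketch, assuming half-open `(chrom, start, end)` intervals and a linear scan. The function name and data layout are hypothetical, and the bootstrap used for the random expectation is not reproduced because the legend does not describe how it was drawn.

```python
def fraction_sites_near_ervs(sites, ervs, flank=100):
    """Fraction of binding sites overlapping any ERV extended by `flank` bp.

    sites, ervs: lists of (chrom, start, end) half-open intervals.
    """
    if not sites:
        return 0.0
    hits = 0
    for chrom, s_start, s_end in sites:
        for e_chrom, e_start, e_end in ervs:
            # Extend the ERV on both sides before testing for overlap.
            if (chrom == e_chrom
                    and s_start < e_end + flank
                    and e_start - flank < s_end):
                hits += 1
                break  # count each site once, even if several ERVs match
    return hits / len(sites)
```

For genome-scale data an interval index would replace the nested loop; the scan is kept here only for readability.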
Figure 5. Class I and II ERVs are simultaneously de-repressed and lose H3K9me3 exclusively in SETDB1 KO mESCs
Unambiguous RNA-seq and ChIP-seq reads aligning to ERVs with internal regions flanked by their cognate annotated LTRs were assembled as described in the Supplemental Information. A. RNA-seq and H3K9me3 RPKM values for ETn and ERVK10C ERVs are shown for TT2 vs. SETDB1 KO or J1 vs. DNMT TKO lines. B. Plotting H3K9me3 vs. RNA-seq Z-scores reveals that numerous ERVs lose H3K9me3 and are concomitantly de-repressed exclusively in the SETDB1 KO line. C. In contrast, L1 elements show no consistent changes in expression or H3K9me3 in either KO line. D. TT2 cells were transfected with siRNAs specific for Dnmt1 or Setdb1, alone or in combination, and expression relative to a scrambled siRNA control was determined for several ERVs by qRT-PCR (technical replicates, mean +/− SD) (see also Figure S5, Table S4).
Figure 6. Increased genic expression in SETDB1 KO mESCs is associated with increased expression of promoter proximal ERVs
A. Protein coding genes were grouped according to the presence of an annotated ERV within 5kb of the annotated TSS(s) and then classified solely on the basis of the presence or absence of RNA-seq reads over these promoter proximal ERVs in the TT2 and/or SETDB1 KO lines. The distribution of RNA-seq coverage (normalized exonic RPKM) for genes with no proximal ERV is shown, along with genes harboring promoter proximal ERVs that are: 1. repressed in both lines (RNA-seq coverage <1.0 aRPKM) (see Supplemental Experimental Procedures); 2. expressed in both lines (RNA-seq coverage ≥1.0 aRPKM and SETDB1 KO aRPKM/TT2 aRPKM between 0.75 and 1.3); or 3. expressed predominantly in the SETDB1 KO line (RNA-seq coverage ≥1.0 aRPKM and SETDB1 KO aRPKM/TT2 aRPKM ≥10). The number of genes in each category is also shown. B. UCSC genome-browser screen shot of the 5′ end of the Akr1c21 gene, showing H3K9me3 NChIP-seq and RNA-seq tracks, alignment of the split paired-end RNA-seq reads in the locus and ERVs 5′ of the gene (see Figure S6).
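The three ERV categories in panel A amount to a small decision rule on the aRPKM values in the two lines. This is a minimal sketch, not the authors' procedure. It assumes that "expressed in both" requires ≥1.0 aRPKM in both lines, that "expressed predominantly in the KO" requires ≥1.0 aRPKM in the KO line only, and that a zero TT2 value counts as an infinite ratio. The function name, category labels, and the "unclassified" fallback are hypothetical.

```python
def classify_proximal_erv(tt2_arpkm, ko_arpkm, min_cov=1.0):
    """Assign a promoter-proximal ERV to one of the legend's categories."""
    if tt2_arpkm < min_cov and ko_arpkm < min_cov:
        return "repressed_in_both"
    # Assumption: no coverage in TT2 is treated as an unbounded KO/TT2 ratio.
    ratio = ko_arpkm / tt2_arpkm if tt2_arpkm > 0 else float("inf")
    if tt2_arpkm >= min_cov and ko_arpkm >= min_cov and 0.75 <= ratio <= 1.3:
        return "expressed_in_both"
    if ko_arpkm >= min_cov and ratio >= 10:
        return "ko_predominant"
    # ERVs with intermediate ratios fall outside the three categories.
    return "unclassified"
```

Written this way, the rule makes it explicit that ERVs with a KO/TT2 ratio between 1.3 and 10 belong to none of the three plotted groups.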
Figure 7. Chimaeric transcripts initiating in LTR elements 5′ of genic TSSs and splicing to canonical genic exons are detected exclusively in the SETDB1 KO line
A. Genes with one paired-end read mapping to an annotated ERV and the other to a genic exon were identified. The top 20 genes in the SETDB1 KO, in terms of the number of chimaeric reads identified, are shown, along with RNA-seq coverage over genic exons. Annotation of the ERV in which transcription initiates, the orientation of the ERV in relation to the gene and the presence of SETDB1 or H3K9me3 in the ERV or at the 5′ end of the gene are also shown. Stars indicate the subclasses of ERVs that are broadly reactivated. B. The presence of chimaeric transcripts of the Akr1c21, Angptl6, Gm1110, Mep1b and Cyp2b23 genes was validated by RT-PCR using primers (arrows) designed within the 50 bp regions to which the chimaeric paired-end reads aligned. β-actin was used as a control. C. Amplicons were cloned and sequenced and the structure of the chimaeric RNAs, the orientation and subfamily of the ERV in which transcription initiates and the annotated genic TSS and exons (numbered) are shown for each locus. The sequences of the relevant novel splice donor (SD) and genic splice acceptor (SA) sites are also shown. For several genes, splicing to several genic exons was observed (see Figure S7).
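The screen described in panel A, one mate in an annotated ERV and the other in a genic exon, can be sketched as an interval test on each read pair. This is an illustrative sketch under stated assumptions, not the authors' pipeline. Mates and annotations are taken as half-open `(chrom, start, end)` tuples, strand and splicing are ignored, and all names are hypothetical.

```python
from collections import Counter


def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_chimeric_pairs(read_pairs, ervs, exons):
    """Count read pairs with one mate in an ERV and the other in a genic exon.

    read_pairs: iterable of (mate1, mate2) intervals.
    ervs: list of (chrom, start, end) ERV intervals.
    exons: list of ((chrom, start, end), gene_name) tuples.
    Returns a Counter keyed by gene name.
    """
    counts = Counter()
    for m1, m2 in read_pairs:
        # Try both assignments of the mates to the ERV and exon roles.
        for erv_mate, exon_mate in ((m1, m2), (m2, m1)):
            if not any(overlaps(erv_mate, e) for e in ervs):
                continue
            genes = {g for iv, g in exons if overlaps(exon_mate, iv)}
            for g in genes:
                counts[g] += 1
            if genes:
                break  # avoid counting the same pair in both assignments
    return counts
```

Ranking the resulting counter with `counts.most_common(20)` would give a list analogous to the top 20 genes described in the legend.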
