Nat Commun. 2018 Feb 5;9(1):487.
doi: 10.1038/s41467-017-02798-1.

Transcriptional Decomposition Reveals Active Chromatin Architectures and Cell Specific Regulatory Interactions

Free PMC article


Sarah Rennie et al. Nat Commun. 2018.

Abstract

Transcriptional regulation is tightly coupled with chromosomal positioning and three-dimensional chromatin architecture. However, it is unclear what proportion of transcriptional activity is reflecting such organisation, how much can be informed by RNA expression alone and how this impacts disease. Here, we develop a computational transcriptional decomposition approach separating the proportion of expression associated with genome organisation from independent effects not directly related to genomic positioning. We show that positionally attributable expression accounts for a considerable proportion of total levels and is highly informative of topological associating domain activities and organisation, revealing boundaries and chromatin compartments. Furthermore, expression data alone accurately predict individual enhancer-promoter interactions, drawing features from expression strength, stabilities, insulation and distance. We characterise predictions in 76 human cell types, observing extensive sharing of domains, yet highly cell-type-specific enhancer-promoter interactions and strong enrichments in relevant trait-associated variants. Overall, our work demonstrates a close relationship between transcription and chromatin architecture.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Fig. 1
Transcriptional decomposition separates the proportion of RNA expression related to chromosomal position from positionally independent (PI) effects. a Schematic illustrating how RNA expression derives from two major sources. The positionally dependent (PD) component reflects the underlying dependency between linearly proximal TUs in chromosomal, positional neighbourhoods, which are related to chromatin neighbourhoods of TU three-dimensional proximity. The PI component reflects localised, gene-specific regulatory programs unaffected by the positioning of TUs. b Overall strategy of how replicated samples are decomposed into transcriptional components. Via approximate Bayesian modelling, normalised RNA expression count data quantified in genomic bins (here 10 kb), are decomposed into an intercept (α), a PI component and a PD component. The PD component is modelled as a first-order random walk, in which the difference between consecutive bins is assumed to be Normal and centred at 0 (Methods). The variable y represents the expression level, x represents the component value in bin i and τ represents the precision of a normally distributed random variable
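The decomposition in panel b can be illustrated with a deliberately simplified sketch. It assumes log-scale expression values already averaged over replicates, a Gaussian likelihood and fixed precisions, whereas the caption describes approximate Bayesian modelling of normalised count data. The function name `rw1_decompose` and the parameters `tau_pd` and `tau_pi` are hypothetical. Under those assumptions, the smooth level (intercept plus PD) minimises a penalised least-squares criterion whose first-order random-walk penalty on consecutive bins yields a tridiagonal system, solved here with the Thomas algorithm.

```python
def rw1_decompose(y, tau_pd=50.0, tau_pi=1.0):
    """Split y into (alpha, pd_comp, pi_comp) with y[i] = alpha + pd_comp[i] + pi_comp[i].

    Assumption (not the authors' fitting procedure): Gaussian errors and fixed
    precisions. tau_pd penalises differences between consecutive bins (RW1);
    tau_pi is the precision of the bin-specific, positionally independent term.
    """
    n = len(y)
    if n == 0:
        raise ValueError("y must contain at least one bin")

    # Tridiagonal system (tau_pi * I + tau_pd * D'D) u = tau_pi * y,
    # where D takes first differences and u = alpha + PD is the smooth level.
    diag = [tau_pi + tau_pd * ((i > 0) + (i < n - 1)) for i in range(n)]
    off = -tau_pd
    rhs = [tau_pi * v for v in y]

    # Thomas algorithm: forward elimination.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom

    # Back substitution.
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]

    alpha = sum(u) / n                       # intercept
    pd_comp = [v - alpha for v in u]         # smooth, position-dependent part
    pi_comp = [yi - ui for yi, ui in zip(y, u)]  # bin-specific remainder
    return alpha, pd_comp, pi_comp


if __name__ == "__main__":
    # Toy input: a step in log expression across 20 bins (invented values).
    toy = [0.0] * 10 + [10.0] * 10
    alpha, pd_comp, pi_comp = rw1_decompose(toy)
    print(round(alpha, 3), [round(v, 2) for v in pd_comp])
```

In this toy setting, raising `tau_pd` makes the PD track smoother and shifts more of the signal into the PI component.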
Fig. 2
Transcriptional decomposition across chromosomes. a PI and PD components (mean ± standard deviations), as well as H3K27me3 and H3K36me3 ChIP-seq data for GM12878, HeLa-S3 and HepG2 cells at locus chr1:145,000,000–180,000,000. b, c Loci (highlighted in a) around KCNN3 (b) and NOS1AP (c) genes showing cell-type-specific PD signals. The PD and PI signals and ChIP-seq data associated with repression (H3K27me3) and activation (H3K36me3) are shown
Fig. 3
Transcriptional components reveal chromatin compartments and localised regulatory element-associated effects. a Box-and-whisker plot of GM12878 PD signal grouped according to HiC-derived chromatin compartments. The lower and upper hinges of boxes correspond to the first and third quartiles of data, respectively, and the whiskers extend to the largest and smallest data points no further than 1.5 times the interquartile range from the hinges. b Random forest class (presence/absence) probability of DNaseI, H2AZ, H3K4me3 and H3K27ac as learned from the PI component (x axes) and the PD component (y axes). c PD difference (x axis, difference in PD value for log-transformed expression) versus false discovery rate (FDR)-adjusted p-value (rescaled by –log10) for PD component differential expression between GM12878 and HeLa-S3. Red represents significant bins unique to the PD component (corresponding to 473 bins), and blue represents those common to both (11 bins). d As c but for PI component (red: 282 significant bins unique to the PI component). e Expressed TF motif enrichment around expressed CAGE-derived promoters associated with GM12878 or HeLa-S3-biased differentially expressed PD or PI bins, versus all expressed CAGE-derived promoters. See Supplementary Figure 2e for all enriched expressed TFs
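The box-and-whisker convention described for panel a can be written out as a short sketch. The quantile interpolation method is an assumption (the plotting software is not named in the caption), and `box_stats` is a hypothetical helper name.

```python
from statistics import quantiles


def box_stats(values):
    """Hinges, whisker ends and outliers for a box-and-whisker plot.

    Assumption: quartiles use linear interpolation between order statistics
    (the 'inclusive' method). Needs at least two values.
    """
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence = q1 - 1.5 * iqr
    hi_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data points still inside the fences.
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    outliers = [v for v in data if v < lo_fence or v > hi_fence]
    return {
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Invented values; 20 lies beyond the upper fence and is drawn as an outlier.
    print(box_stats([1, 2, 3, 4, 5, 20]))
```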
Fig. 4
Expression-associated domains mark regions of active topological domains. a Approach for identifying boundaries of expression-associated domains (XADs) based on a PD boundary score. PD signal (mean ± standard deviations), PD stability (across cell PD standard deviation) and the XAD boundary score are shown. b Average GM12878 profiles of binarised ChIP-seq data for CTCF, CTCF in combination with Rad21 (cohesin), DNaseI, H3K36me3, H3K27me3 and H3K27ac at XAD boundaries with positive PD gradient (dark blue) and at random expressed bins with positive PD gradient (light blue). The vertical dotted line represents boundary locations and the horizontal dotted line represents background mean for a given mark. c Enrichment of GM12878 TAD boundaries among XAD boundaries compared to random bins proximal to expressed bins (DHS+ for DHS associated, DHS− for DHS non-associated). Error bars were derived from generating the random bin set 100 times and calculating the standard deviation of their enrichment versus the actual boundaries. d Venn diagram of association between GM12878 XAD boundaries and proximal (within five bins) DHS-associated TAD boundaries
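One possible reading of the enrichment in panel c is sketched below: the fraction of XAD boundaries lying near a TAD boundary is compared with the same fraction for repeatedly drawn random bin sets. The exact enrichment statistic and the background set are not spelled out in the caption, so the ratio used here, the reuse of the five-bin window from panel d, the function names and the toy bin indices are all assumptions.

```python
import random
from statistics import mean, stdev


def fraction_near(bins, targets, window=5):
    """Fraction of bins with at least one target within `window` bins."""
    target_set = set(targets)
    hits = sum(
        any((b + d) in target_set for d in range(-window, window + 1))
        for b in bins
    )
    return hits / len(bins)


def boundary_enrichment(xad, tad, background, n_draws=100, window=5, seed=0):
    """Observed fraction plus one observed/random ratio per random draw.

    Draws in which no random bin is near a TAD boundary are skipped to
    avoid division by zero (a choice made for this sketch only).
    """
    rng = random.Random(seed)
    observed = fraction_near(xad, tad, window)
    ratios = []
    for _ in range(n_draws):
        sample = rng.sample(background, len(xad))
        expected = fraction_near(sample, tad, window)
        if expected > 0:
            ratios.append(observed / expected)
    return observed, ratios


if __name__ == "__main__":
    # Invented bin indices, for illustration only.
    xad = [10, 50, 90, 130, 170]
    tad = [12, 48, 95, 128, 260]
    background = list(range(300))
    observed, ratios = boundary_enrichment(xad, tad, background)
    if len(ratios) > 1:
        print(observed, mean(ratios), stdev(ratios))
```

The mean of the ratios would correspond to a bar height and their standard deviation to an error bar under this reading.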
Fig. 5
Expression data are predictive of cell-type-specific regulatory interactions. a, b Performance curves for predicting bait–target interactions from CAGE-universal features, split according to CHiCAGO score, testing negative-to-positive ratios and target feature type. c Features predictive of bait–target interactions, ordered by average mean decrease accuracy (MDA) across models from 10-fold cross-validation. d, e Loess curves representing feature separation over distance between high (top) and low (bottom) predicted probabilities, shown for features (d) XAD boundary insulation (nbounds, number of XAD boundaries between bins) and (e) enhancer expression at the target (eRNA_targ, mean tags per million across three replicates). f Overlaps of predicted bait–enhancer interactions between GM12878, HeLa-S3 and HepG2 cells. g An example of a loop predicted in HeLa-S3, but not in GM12878 cells, validated by HeLa-S3 RNAPII ChIA-PET interaction data
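The mean decrease accuracy (MDA) ranking in panel c rests on a permutation idea that can be sketched independently of any particular classifier. In a random forest, MDA is normally computed on out-of-bag samples per tree; the version below instead permutes one feature on a fixed evaluation set and uses a hypothetical threshold rule as a stand-in model. The feature names `nbounds` and `eRNA_targ` come from the caption, while the data values and the rule are invented.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which the prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def mean_decrease_accuracy(predict, rows, labels, feature, n_repeats=20, seed=0):
    """Average drop in accuracy after shuffling one feature across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with the shuffled value, leaving other features intact.
        permuted = [{**r, feature: v} for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / n_repeats


if __name__ == "__main__":
    # Toy bait-target pairs (invented values).
    rows = [
        {"nbounds": 0, "eRNA_targ": 3.2},
        {"nbounds": 2, "eRNA_targ": 0.1},
        {"nbounds": 1, "eRNA_targ": 2.5},
        {"nbounds": 3, "eRNA_targ": 0.4},
    ]

    def toy_model(r):
        # Stand-in classifier: call an interaction when the target eRNA is expressed.
        return r["eRNA_targ"] > 1.0

    labels = [toy_model(r) for r in rows]
    for name in ("nbounds", "eRNA_targ"):
        print(name, mean_decrease_accuracy(toy_model, rows, labels, name))
```

A feature the model never reads, such as `nbounds` in this toy rule, gets an MDA of exactly zero, which is why the measure can be used to order features by predictive contribution.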
Fig. 6
Transcriptional decomposition across 76 human cell types. a–c Heat maps depicting pairwise similarities of the PD (a), PI (b) components and EP interactions (c) across cell types. All similarity scores were calculated between cell types using 1 − L1 norm on binary data sets based on the sign of the PD, PI components or the presence/absence of EP interactions. Cell-type ordering is fixed according to the results of a hierarchical clustering (complete linkage) of the raw expression data. d Predicted probability of EP interactions averaged across groups of cells and CHiCAGO score of interaction based on GM12878 capture HiC data. GENCODE v24 transcripts, FANTOM5 enhancers and the average PD and PI component across blood cells are displayed below
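The similarity score used for the heat maps can be sketched as follows. The caption does not say whether the L1 norm is normalised; this sketch assumes it is divided by the number of bins so that the score lies between 0 and 1, and it treats a component value of exactly zero as non-positive. The function name `sign_similarity` is hypothetical.

```python
def sign_similarity(a, b):
    """1 minus the mean absolute difference between sign-binarised vectors.

    Assumptions: the L1 norm is normalised by vector length, and values
    that are not strictly positive are coded as 0.
    """
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    sa = [1 if v > 0 else 0 for v in a]
    sb = [1 if v > 0 else 0 for v in b]
    return 1.0 - sum(abs(x - y) for x, y in zip(sa, sb)) / len(sa)


if __name__ == "__main__":
    # Invented PD values for two cell types over four bins.
    print(sign_similarity([1.0, -1.0, 2.0, -3.0], [1.0, 1.0, -1.0, -1.0]))
```

For presence/absence data such as EP interactions, the inputs are already binary and pass through the binarisation step unchanged.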
Fig. 7
Analysis of Crohn’s-disease associated SNPs reveals cell-type preferences and regulatory associations. a Crohn’s disease SNP enrichments in transcriptional components, XAD boundaries and inferred enhancer–promoter interactions (FDR-corrected χ2 tests based on the PD/PI positive bins or the presence of XAD boundaries/target enhancers per cell type, and trait-associated SNPs). See also Supplementary Figure 16 for enrichments in lymphoid leukaemia. Significance stars above bars are interpreted as: *P<0.1, **P<0.01 or ***P<0.001. b, c Predicted EP interactions in CD14+ monocytes treated with Cryptococcus, for genes PTGER4 (b) and TRIB1 (c) for which interacting enhancers overlap or are in close proximity with disease-associated SNPs (highlighted in yellow). Predicted EP interactions, GENCODE v24 transcripts, FANTOM5 enhancers and the locations of relevant SNPs are displayed below
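The testing scheme behind panel a, a χ2 test per cell type followed by FDR correction, can be sketched for a 2×2 table of bins (for example PD-positive or not, overlapping a trait-associated SNP or not). The caption does not name the FDR procedure or say whether a continuity correction was applied; the sketch assumes Benjamini-Hochberg adjustment and no continuity correction, and the counts are invented.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for the table [[a, b], [c, d]].

    Assumption: no continuity correction. With one degree of freedom the
    upper-tail probability is erfc(sqrt(stat / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs a non-zero total")
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, erfc(sqrt(stat / 2))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvalues[i] * m / rank)
        adjusted[i] = running
    return adjusted


if __name__ == "__main__":
    # Invented counts for three cell types.
    tables = [(20, 10, 10, 20), (15, 15, 14, 16), (30, 5, 12, 13)]
    pvals = [chi2_2x2(*t)[1] for t in tables]
    print(bh_adjust(pvals))
```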


