Nature. 2020 Jul;583(7818):737-743.
doi: 10.1038/s41586-020-2151-x. Epub 2020 Jul 29.

Landscape of cohesin-mediated chromatin loops in the human genome

Free PMC article

Fabian Grubert et al. Nature. 2020 Jul.

Abstract

Physical interactions between distal regulatory elements have a key role in regulating gene expression, but the extent to which these interactions vary between cell types and contribute to cell-type-specific gene expression remains unclear. Here, to address these questions as part of phase III of the Encyclopedia of DNA Elements (ENCODE), we mapped cohesin-mediated chromatin loops, using chromatin interaction analysis by paired-end tag sequencing (ChIA-PET), and analysed gene expression in 24 diverse human cell types, including core ENCODE cell lines. Twenty-eight per cent of all chromatin loops vary across cell types; these variations modestly correlate with changes in gene expression and are effective at grouping cell types according to their tissue of origin. The connectivity of genes corresponds to different functional classes, with housekeeping genes having few contacts, and dosage-sensitive genes being more connected to enhancer elements. This atlas of chromatin loops complements the diverse maps of regulatory architecture that comprise the ENCODE Encyclopedia, and will help to support emerging analyses of genome structure and function.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Characteristics of cohesin-mediated chromatin interactions.
a, b, Cohesin ChIA-PET heat maps for the pan-cell-line data set. Signal tracks at the top and to the left of heat maps correspond to CTCF and RAD21 (cohesin) ChIP–seq signals and cohesin ChIA-PET loops (blue). a, Approximately 750-kb view including a contact domain (brown triangle) identified in lung fibroblasts (IMR90). b, Approximately 250-kb expanded view of contact domain (brown triangle). Dark blue squares, chromatin loops identified in our data set. For comparison, loops identified with in-situ Hi-C across eight cell lines are shown as squares in various colours. Heat maps were generated with Juicer and visualized with Juicebox. c, Sizes of cohesin-mediated chromatin loops identified in this study (n = 124,830) relative to TADs (n = 35,435), contact domains (n = 9,263), and high-resolution in situ Hi-C chromatin loops (n = 19,846). Centre line represents the median, box extent ranges from 25th to 75th percentile and whiskers extend at most to 1.5× the interquartile range. Summary statistics for the boxplots can be found in Supplementary Table 9. d, Per cent of Hi-C chromatin loops across seven cell lines (light blue) or contact domains from GM12878 (yellow) that overlap our pan-cell-line loop set. e, CTCF motif orientation at chromatin loop ends.
Fig. 2. Chromatin loop variation across 24 cell types.
a, Examples of variable (left) and non-variable loops (right) across cell types. Chromatin loops are displayed above the corresponding RAD21 signal tracks. The colour density of loops corresponds to normalized interaction frequency (darker blue indicates higher frequency). *Isogenic cell types. b, PCA of normalized chromatin loop interaction frequencies (n = 85,294 loops versus n = 48 samples (24 cell types × 2 replicates each)). Colours denote the germ layer origin of each sample (Supplementary Table 2). c, Correlation of interaction frequencies between pairs of cell types (all types, n = 1,104 pairs; isogenic, n = 15; germ layer, n = 316; tissue, n = 160; biological replicates, n = 24; P values calculated using two-sided Wilcoxon rank-sum test). Centre line represents the median, box extent ranges from 25th to 75th percentile, and whiskers extend at most to 1.5× the interquartile range. Summary statistics for the boxplots can be found in Supplementary Table 9. d, Size distribution of variable chromatin loops versus two different sets of non-variable control loops (n = 35,698, significance assessed using two-sided t-test). Centre line represents the median, box extent ranges from 25th to 75th percentile, and whiskers extend at most to 1.5× the interquartile range. Summary statistics for the boxplots can be found in Supplementary Table 9. e, Overlap of variable and non-variable chromatin loops with contact domains. f, Enrichment of cell-type-specific genes and depletion of housekeeping genes (n = 2,220) in variable versus non-variable loops (n = 35,698). P values calculated using a two-sided Fisher’s exact test. Summary statistics for the figure can be found in Supplementary Table 9.
Fig. 3. Cell-type-specific loops exhibit enrichment for specific chromatin states.
a, Enrichment of domain boundaries, enhancers and promoters relative to connectivity of loop ends (number of interactions assessed, 124,830; *P < 0.05, **P < 0.005, ***P < 2.2 × 10^−16; NS, not significant (P = 0.67); significance assessed by two-sided Fisher’s exact test). Summary statistics for the enrichment calculations can be found in Supplementary Table 9. b, Examples of cell-type-specific active enhancers and chromatin loops across three cell types. Chromatin loops are displayed above the corresponding H3K27ac signal tracks. Loop colour intensity corresponds to interaction frequency. Below the H3K27ac track is the chromatin state annotation obtained from the Roadmap Epigenomics Mapping Consortium. H1-hESC cells have minimal enhancer activity and few loops. Cohesin loops colocalize with regions of high enhancer activity in GM12878 and MSFib cells. c, Proportion of chromatin states in cell-type-specific loop ends for a lymphoblastoid cell line (GM12878), an embryonic line (H1-hESC) and a skin-derived fibroblast line (MSFib). d, Fold-enrichments of chromatin states at cell-type-specific loop ends in GM12878, H1-hESC and MSFib cells. Number of interactions assessed (top 10%) = 8,529; *P < 0.05, **P < 0.005, ***P < 2.2 × 10^−16; NS, not significant; P values assessed by two-sided Fisher’s exact test and adjusted for multiple hypothesis testing using the Benjamini–Hochberg procedure. See Supplementary Table 9 for a complete list of enrichments and P values. e, Fold-enrichments of cell-type-specific loops linking cell-type-specific enhancer–enhancer pairs (ENH–ENH; mean = 2.57), enhancer–promoter pairs (ENH–TSS; mean = 2.18) and promoter–promoter pairs (TSS–TSS; mean = 1.47) (n = 21 cell types; error bars, s.d.). f, Normalized expression level for each gene, binned by the number of cell-type-specific enhancer connections per gene (P values assessed by two-sided Wilcoxon rank-sum test). Centre line represents the median, box extent ranges from 25th to 75th percentile and whiskers extend at most to 1.5× the interquartile range. Summary statistics for the boxplots can be found in Supplementary Table 9. g–i, log2[odds ratios] for haploinsufficient genes (g), disease genes in ClinVar (h), and housekeeping genes (i) that have a certain number of enhancers linked to their promoters. *P < 0.05, two-sided Fisher’s exact test with Benjamini–Hochberg adjustment for multiple hypothesis testing; n = 19,353 chromatin loops. See Supplementary Table 9 for a complete list of P values.
Fig. 4. Variable chromatin loops correspond to changes in gene expression levels and alternative splicing.
a, Example of chromatin loop changes with accompanying changes in gene expression and chromatin activity. Red arrow indicates a loop that links an active H3K27ac site to the promoter of MTDH. b, Pearson correlation between loop interaction frequencies and expression levels for the enhancer loop and MTDH (red arrow in a); sample size, 46 (23 cell types × 2 replicates). c, Spearman rank correlation (absolute value) between loop interaction frequency and gene expression levels for all loop–gene pairs versus randomized loop–gene pairs (n = 90,657, P < 2.2 × 10^−16, two-sided Wilcoxon rank-sum test). d, Schematic of various ways to map loops to genes. e, Spearman rank correlation (absolute value) between loop interaction frequency and gene expression for different groups of loop–gene pairs. f, Fold-enrichment for positive correlation between loop frequency and gene expression levels for different groups of loop–gene pairs (all, n = 90,655 pairs; promoter, n = 18,628; contained, n = 40,719; enhancer–promoter, n = 4,421). Odds ratio ± 95% confidence interval (CI); ***P < 2.2 × 10^−16; two-sided Fisher’s exact test. g, Median Spearman rank correlation of expression levels between distance-matched pairs of genes that are not located in the same loop (blue), are located in the same loop (red), or are located in the same variable loop (green). h, Pearson correlation of the normalized ChIA-PET anchor counts and the exon counts across all cell types for exon–loop pairs (red, n = 277), exon–loop pairs of the same gene (blue, n = 1,347) and 100 permutations of the exon associated to the anchor (grey, n = 27,700). Red versus blue, P = 9.2 × 10^−3; red versus grey, P = 8.90 × 10^−193. i, Scatterplots of the DUE and anchor counts for real pairs (blue, n = 111) and other exon–loop pairs within the same gene (pink, n = 1,347). j, Example of an intragenic loop that affects exon inclusion for gene ARHGEF7. Exon 6 (yellow) is included in the blood-specific group, but not in the stem-like/embryonic or solid groups.
Fig. 5. Characterization of group-specific loops.
a, Fold-enrichment of 598 TF motifs in blood-specific chromatin loop ends (n = 3,384). Significance assessed using two-sided Fisher’s exact test with Benjamini–Hochberg correction for multiple hypothesis testing. Top hits are highlighted in red; complete enrichment results are provided in Supplementary Table 9. b–d, Chromatin accessibility determined by ATAC–seq at blood-specific loop anchors centred at the motif instances for SPIB, SPI and TCF3. e, Biological processes associated with blood-specific chromatin loops (n = 3,384). Enrichment was assessed using the GREAT tool. f, Enrichment of disease-specific GWAS SNPs (n = 86 diseases) in blood-specific loop ends (n = 3,384) assessed by a P value permutation test. HDL, HDL cholesterol; LDL, LDL cholesterol; HT, hypertension; SBP, systolic blood pressure; DBP, diastolic blood pressure; BMI, body mass index; UACR, urinary albumin–creatinine ratio; ALS, amyotrophic lateral sclerosis; AD, Alzheimer’s disease. g, Association of blood-specific chromatin loop anchors (n = 3,384) with GWAS traits observed by partitioned LD score regression using a common set of 47 traits (n = 1,100,000 HapMap3 SNPs, block jackknife t-test; mean ± s.d.).
Extended Data Fig. 1. Quality metrics and convergence rates.
a, Flowchart of study design. b, Total number of reads obtained for each ChIA-PET sample. c, Relative strand correlation (RSC) score for RAD21 ChIA-PET data. d, RSC for H3K27ac ChIP–seq data. e–g, Comparison of CTCF motif presence and orientation at chromatin loops identified in this study and other published data sets. e, Fraction of chromatin loops with exactly one CTCF motif at both loop ends. f, Fraction of chromatin loops with at least one CTCF motif at both loop ends. g, Fraction of chromatin loops with convergent CTCF motif orientation.
Extended Data Fig. 2. Variability in chromatin loops.
a, b, PCA of H3K27ac ChIP–seq data (a; n = 288,711 peaks versus n = 44 samples (22 cell types × 2 replicates each)) and RNA-seq data (b; n = 22,197 genes versus n = 46 samples (23 cell types × 2 replicates each)); samples are coloured according to the germ layer from which they originated. c, PCA of chromatin loop interaction frequencies (n = 85,294 loops versus n = 48 samples (24 cell types × 2 replicates each)). Colours denote the experimental batch of each sample. d, GC content in anchor regions of different sets of chromatin loops. Centre line represents the median, box extent ranges from 25th to 75th percentile and whiskers extend at most to 1.5× the interquartile range. Summary statistics for the boxplots can be found in Supplementary Table 9. e, Correlation of chromatin loop interaction frequencies (Spearman rank correlation; y-axis) between pairs of cell types at varying PET thresholds (x-axis). f, Number of variable loops found at different FDR thresholds. g, Size distribution of variable chromatin loops versus non-variable loops at different FDR cutoffs. ***P < 0.04. Significance was assessed using a two-sided t-test. Centre line represents the median, box extent ranges from 25th to 75th percentile, whiskers extend at most to 1.5× the interquartile range. Summary statistics for the boxplots can be found in Supplementary Table 9. h, Variability of loops of different sizes. Summary statistics for all boxplots can be found in Supplementary Table 9. i, Enrichment of cell-type-specific genes and depletion of housekeeping genes (n = 2,220) in variable versus non-variable loops (n = 35,698) for null sets 1 and 2. P values by Fisher’s exact test. j, Enrichment of broadly expressed genes at variable and non-variable chromatin loops. The set of broadly expressed genes was obtained from the GTEx project.
Extended Data Fig. 3. Cell-type-specific loops exhibit enrichment for specific chromatin states.
a, Proportion of chromatin states in cell-type-specific loop ends for various cell types from the blood group (red), the stem cell/embryonic group (purple) and the group derived from solid tissue (black). b, Fold-enrichment of chromatin states in cell-type-specific loop ends for the cell types in a. Number of interactions assessed (top 10%) = 8,529; *P < 0.05, **P < 0.005, ***P < 2.2 × 10^−16, n.s. = non-significant; P values assessed by two-sided Fisher’s exact test and corrected for multiple hypothesis testing using the Benjamini–Hochberg procedure. See Supplementary Table 9 for a complete list of enrichments and P values.
Extended Data Fig. 4. Connectivity of genes corresponds to gene function.
a–d, log2 odds ratios for different groups of genes with a certain number of loops linked to their promoters (*adjusted P < 0.05 by two-sided Fisher’s exact test; n = 19,353 loops). See Supplementary Table 9 for a complete list of P values. a, Haploinsufficient genes; b, genes in GWAS catalogue; c, disease genes in ClinVar; d, housekeeping genes. e–h, log2 odds ratios for each cell type shown for genes identified as haploinsufficient (e), genes in GWAS catalogue (f), disease genes in ClinVar (g) or housekeeping genes (h) and having at least a given number of loops ending at their promoters (*adjusted P < 0.05 by two-sided Fisher’s exact test). See Supplementary Table 9 for a complete list of enrichments and P values.
Extended Data Fig. 5. Chromatin loops are associated with alternative splicing across cell types.
a, b, Distribution of the distance (bp) between the centre of the loop anchors and the TSS (a) or the exon 5′ boundary (b). c, DEXSeq plot showing the differential exon usage of all exons for gene ARHGEF7, highlighting exon 6, which is affected by an intragenic loop in the blood cell types. d, Scatterplot of the normalized counts of exon 6 in ARHGEF7 with respect to the log2-transformed fold change in loop strength for all cell types (n = 44 (22 cell types × 2 biological replicates); Pearson correlation, 0.49).
Extended Data Fig. 6. Transcription factor analysis, GO enrichments and GWAS for embryonic-specific loops.
a, Fold-enrichments of 598 TF motifs in chromatin loop ends that are embryonic-specific (n = 2,894). Significance was assessed using two-sided Fisher’s exact test. P values were adjusted for multiple hypothesis testing using the Benjamini–Hochberg procedure. b, Biological processes associated with embryonic-specific chromatin loops (n = 2,894). Enrichment was performed using GREAT. c, Enrichment of disease-specific GWAS SNPs in embryonic-specific loop ends. MDD, major depressive disorder; BPD, bipolar disorder; HOMAIR, homeostatic model assessment for insulin resistance. d, Association of embryonic-specific chromatin loop anchors (n = 2,894) with GWAS traits observed by partitioned LDSC using a common set of 47 traits (n = 1,100,000 HapMap3 SNPs, block jackknife t-test, mean ± s.d.).
Extended Data Fig. 7. Association of chromatin loops with GWAS traits.
a–h, Association of blood-specific (a–d) and embryonic-specific (e–h) chromatin loop anchors with GWAS traits observed by partitioned LDSC using a common set of 47 traits (n = 1,100,000 HapMap3 SNPs, block jackknife t-test, centre values indicate the mean ± s.d.). Within each panel: left, all blood-specific loops (a–d) or embryonic-specific loops (e–h); right, set of loops that does not overlap with super enhancers. All panels adjusted for the set of baseline line traits as previously described; in addition, b, f are adjusted for all RAD21 loops; c, g are adjusted for super-enhancers across all cell types, within blood-specific and embryonic-specific loops; d, h are adjusted for cell-group-specific signal and global Roadmap annotation.
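
Two statistical conventions recur throughout the captions above: boxplots whose whiskers extend at most to 1.5× the interquartile range, and P values adjusted with the Benjamini–Hochberg procedure. The Python sketch below illustrates both on toy numbers. It is not the authors' code: the function names, the example values, and the quartile interpolation rule (`method="inclusive"`) are assumptions made for illustration, since the captions do not specify how quartiles were computed.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Median, quartiles and whisker ends for a boxplot whose whiskers
    extend at most to 1.5 x the interquartile range (IQR).

    Assumption: quartiles use the 'inclusive' interpolation rule; the
    captions do not say which rule the original figures used.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers stop at the most extreme observations inside the fences.
    whisker_low = min(v for v in data if v >= q1 - 1.5 * iqr)
    whisker_high = max(v for v in data if v <= q3 + 1.5 * iqr)
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
    }


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy inputs, not data from the study.
toy_loop_sizes = [1, 2, 3, 4, 5, 6, 7, 8, 100]
toy_p_values = [0.01, 0.04, 0.03, 0.005]

summary = boxplot_summary(toy_loop_sizes)
adjusted = benjamini_hochberg(toy_p_values)
```

In the toy size list, the value 100 lies beyond the upper fence, so the upper whisker stops at the largest observation inside it; this is what "at most" means in the caption wording.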
