Cell 159 (7), 1665-80

A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping


Suhas S P Rao et al. Cell.

Erratum in

  • Cell. 2015 Jul 30;162(3):687-8

Abstract

We use in situ Hi-C to probe the 3D architecture of genomes, constructing haploid and diploid maps of nine cell types. The densest, in human lymphoblastoid cells, contains 4.9 billion contacts, achieving 1 kb resolution. We find that genomes are partitioned into contact domains (median length, 185 kb), which are associated with distinct patterns of histone marks and segregate into six subcompartments. We identify ∼10,000 loops. These loops frequently link promoters and enhancers, correlate with gene activation, and show conservation across cell types and species. Loop anchors typically occur at domain boundaries and bind CTCF. CTCF sites at loop anchors occur predominantly (>90%) in a convergent orientation, with the asymmetric motifs "facing" one another. The inactive X chromosome splits into two massive domains and contains large loops anchored at CTCF-binding repeats.

Figures

Fig. 1. We used in situ Hi-C to map over 15 billion chromatin contacts across nine cell types in human and mouse, achieving 1 kilobase resolution in human lymphoblastoid cells
(A) During in situ Hi-C, DNA-DNA proximity ligation is performed in intact nuclei. (B) Contact matrices from chromosome 14: the whole chromosome, at 500Kb resolution (top); 86–96Mb/50Kb resolution (middle); 94–95Mb/5Kb resolution (bottom). Left: GM12878, primary experiment; Right: replicate. The 1D regions corresponding to a contact matrix are indicated in the diagrams above and at left. The intensity of each pixel represents the normalized number of contacts between a pair of loci. Maximum intensity is indicated in the lower left of each panel. (C) We compare our map of chromosome 7 in GM12878 (last column) to earlier Hi-C maps: Lieberman-Aiden et al., Kalhor et al., and Jin et al. (D) Mean contacts per pixel vs distance, at various resolutions, compared to published Hi-C experiments (dashed line = 10). See also Data S1, Table S1 and Table S2.
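The quantity plotted in panel (D), mean contacts per pixel as a function of distance, can be sketched as an average over each diagonal of a binned contact matrix. This is a minimal illustration, assuming a square, symmetric matrix of contact counts at one fixed bin size; the function name is hypothetical and not taken from the paper.

```python
def mean_contacts_by_distance(matrix):
    """Return the mean contact count on each diagonal of a square matrix.

    Entry d of the result is the average of matrix[i][i + d] over all valid i,
    i.e. the mean contacts per pixel for loci separated by d bins.
    Multiply d by the bin size (e.g. 5 kb) to get genomic distance.
    """
    n = len(matrix)
    means = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        means.append(sum(diagonal) / len(diagonal))
    return means


# Toy 3-bin matrix: contacts decay with distance from the diagonal.
toy = [
    [4, 2, 1],
    [2, 4, 2],
    [1, 2, 4],
]
decay = mean_contacts_by_distance(toy)
```

A map is usable at a given resolution only out to the distance where this curve stays above some minimum; the dashed line in panel (D) marks 10 contacts per pixel.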
Fig. 2. The genome is partitioned into domains that segregate into nuclear subcompartments, corresponding to different patterns of histone modifications
(A) We annotate thousands of domains across the genome (left, black highlight). To do so, we define an arrowhead matrix A (right) such that $A_{i,i+d} = (M^*_{i,i-d} - M^*_{i,i+d}) / (M^*_{i,i-d} + M^*_{i,i+d})$, where M* is the normalized contact matrix. This transformation replaces domains with an arrowhead-shaped motif pointing towards the domain’s upper-left corner (example in yellow). The arrowhead size corresponds to the domain size. Using dynamic programming, this transformation allows us to efficiently compute a “corner score” for each pixel in a Hi-C matrix, indicating the likelihood that the pixel lies at the upper-right corner of a domain. See Experimental Procedures. (B) Pearson correlation matrices of the histone mark signal between pairs of loci inside, and within 100Kb of, a domain. Left: H3K36me3; Right: H3K27me3. (C) Conserved domains on chromosome 3 in GM12878 (left) and IMR90 (right). In GM12878, the highlighted domain (gray) is enriched for H3K27me3 and depleted for H3K36me3. In IMR90, the situation is reversed. Marks at flanking domains are the same in both: the domain to the left is enriched for H3K36me3 and the domain to the right is enriched for H3K27me3. The flanking domains have long-range contact patterns which differ from one another and are preserved in both cell types. In IMR90, the central domain is marked by H3K36me3 and its long-range contact pattern matches the similarly-marked domain on the left. In GM12878, it is decorated with H3K27me3, and the long-range pattern switches, matching the similarly-marked domain to the right. Diagonal submatrices, 10Kb resolution; long-range interaction matrices, 50Kb resolution. (D) Each of the six long-range contact patterns we observe exhibits a distinct epigenetic profile. All epigenetic data is from ENCODE experiments in GM12878 except nuclear lamin (derived from skin fibroblast cells) and NAD (HeLa). See Table S8. Each subcompartment also has a visually distinctive contact pattern. (E) Each example shows part of the long-range contact patterns for several nearby genomic intervals lying in different compartments. (F) A large contiguous region on chromosome 19 contains intervals in subcompartments A1, B1, B2, and B4. See also Data S2 and Data S3.
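The arrowhead transformation in panel (A) can be illustrated on a toy matrix. The sketch below takes the definition to be a normalized difference, A[i][i+d] = (M[i][i-d] - M[i][i+d]) / (M[i][i-d] + M[i][i+d]), and applies it to a plain list-of-lists matrix. The function names and the toy domain are assumptions for illustration only; the corner-score step that the caption mentions is not reproduced here.

```python
def arrowhead_matrix(m):
    """Apply the arrowhead transformation to a square contact matrix m.

    For each locus i and offset d, compare contacts with the locus d bins
    upstream (i - d) against contacts with the locus d bins downstream (i + d).
    Entries where i - d or i + d falls outside the matrix are left at 0.
    """
    n = len(m)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for d in range(1, min(i, n - 1 - i) + 1):
            upstream = m[i][i - d]
            downstream = m[i][i + d]
            total = upstream + downstream
            if total > 0:  # avoid dividing by zero in empty regions
                a[i][i + d] = (upstream - downstream) / total
    return a


def toy_domain_matrix(n, start, end, inside=10.0, outside=1.0):
    """Hypothetical test input: one domain covering bins [start, end)."""
    def in_domain(k):
        return start <= k < end
    return [
        [inside if in_domain(i) and in_domain(j) else outside for j in range(n)]
        for i in range(n)
    ]


m = toy_domain_matrix(8, 2, 6)
a = arrowhead_matrix(m)
```

In this toy case the entry is 0 when both flanking loci sit inside the domain, negative at the domain's first bin (only the downstream partner is inside), and positive at its last bin (only the upstream partner is inside). This sign pattern is what produces the arrowhead shape.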
Fig. 3. We identify thousands of chromatin loops genome-wide using a local background model
(A) We identify peaks by detecting pixels that are enriched with respect to four local neighborhoods (blowout): horizontal (blue), vertical (green), lower-left (yellow), and donut (black). These “peak” pixels are marked with blue circles (radius=20Kb) in the lower-left of each heatmap. The number of raw contacts at each peak is indicated. Left: primary GM12878 map; Right: replicate; annotations are completely independent. All contact matrices in these figures are 10Kb resolution unless noted. (B) Overlap between replicates. (C) Top: location of 3D-FISH probes. Bottom: example cell. (D) APA plot shows the aggregate signal from the 9948 GM12878 loops we report by summing submatrices surrounding each peak in a low-resolution GM12878 Hi-C map due to Kalhor et al. See also Figure S4, Table S3, Table S4, and Table S5.
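The idea of local enrichment in panel (A) can be sketched for the donut neighborhood alone. This is a simplified illustration, not the authors' peak caller: it compares a pixel with the plain mean of a surrounding ring. The caption describes four neighborhoods, and the full statistical model is in the paper's Experimental Procedures. The parameters `p` (inner half-width) and `w` (outer half-width) and both function names are assumptions.

```python
def donut_mean(m, i, j, p=1, w=3):
    """Mean of the pixels in a ring around (i, j).

    The ring contains pixels within w bins of (i, j), excluding the inner
    (2p + 1) x (2p + 1) square and the row and column through (i, j).
    Pixels outside the matrix are skipped.
    """
    values = []
    for r in range(max(0, i - w), min(len(m), i + w + 1)):
        for c in range(max(0, j - w), min(len(m[r]), j + w + 1)):
            inner = abs(r - i) <= p and abs(c - j) <= p
            on_cross = r == i or c == j
            if not inner and not on_cross:
                values.append(m[r][c])
    return sum(values) / len(values) if values else 0.0


def donut_enrichment(m, i, j, p=1, w=3):
    """Observed value at (i, j) divided by the ring mean, or None if the ring is empty."""
    background = donut_mean(m, i, j, p, w)
    if background <= 0:
        return None
    return m[i][j] / background


# Toy 9 x 9 map: flat background of 2 contacts with one bright pixel.
flat = [[2.0] * 9 for _ in range(9)]
peaky = [row[:] for row in flat]
peaky[4][4] = 10.0
```

Here `donut_enrichment(peaky, 4, 4)` is 5.0, while the same pixel in `flat` gives 1.0. A real caller would also need to correct for the distance-dependent decay of contacts and test significance rather than threshold a ratio.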
Fig. 4. Loops are often preserved across cell types and from human to mouse
(A) Examples of peak and domain preservation across cell types. Annotated peaks are circled in blue. All annotations are completely independent. (B) Of the 3331 loops we annotate in mouse CH12-LX, 1649 (50%) are orthologous to loops in human GM12878. (C–E) Conservation of three-dimensional structure in synteny blocks. See also Figure S5.
Fig. 5. Loops between promoters and enhancers are strongly associated with gene activation
(A) Histogram showing loop count at promoters (left); restricted to loops where the distal peak locus contains an enhancer (right). (B) Genes whose promoters participate in a loop in GM12878 but not in a second cell type are frequently upregulated in GM12878, and vice-versa. (C) Left: a loop in GM12878, with one anchor at the SELL promoter and the other at a distal enhancer. The gene is on. Right: The loop is absent in IMR90, where the gene is off. (D) Left: Two loops in GM12878 are anchored at the promoter of the inactive ADAMTS1 gene. Right: A series of loops and domains appear, along with evident transitive looping. ADAMTS1 is on. See also Figure S5 and Table S6.
Fig. 6. Many loops demarcate domains; the vast majority of loops are anchored at a pair of convergent CTCF/RAD21/SMC3 binding sites
(A) Histograms of corner score for peak pixels vs. random pixels with an identical distance distribution. (B) Contact matrix for chr4:20.55Mb-22.55Mb in GM12878, showing examples of transitive and intransitive looping behavior. (C) % of peak loci bound vs. fold enrichment for 76 DNA-binding proteins. (D) The pairs of CTCF motifs that anchor a loop are nearly all found in the convergent orientation. (E) A peak on chromosome 1 and corresponding ChIP-Seq tracks. Both peak loci contain a single site bound by CTCF, RAD21, and SMC3. The CTCF motifs at the anchors exhibit a convergent orientation. See also Figure S6.
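The orientation classes in panel (D) can be made concrete with a small classifier. The sketch assumes each loop has exactly one CTCF motif per anchor and that `"+"` means the motif points toward increasing genomic coordinates. Under that convention, a `"+"` motif at the upstream anchor and a `"-"` motif at the downstream anchor face one another. The function names and the example strand pairs are hypothetical, not data from the paper.

```python
def classify_orientation(upstream_strand, downstream_strand):
    """Classify a pair of anchor motifs by their relative orientation."""
    pair = (upstream_strand, downstream_strand)
    if pair == ("+", "-"):
        return "convergent"  # motifs point toward each other
    if pair == ("-", "+"):
        return "divergent"   # motifs point away from each other
    if pair in (("+", "+"), ("-", "-")):
        return "tandem"      # motifs point the same way
    raise ValueError(f"unexpected strands: {pair!r}")


def convergent_fraction(strand_pairs):
    """Fraction of (upstream, downstream) strand pairs that are convergent."""
    labels = [classify_orientation(up, down) for up, down in strand_pairs]
    return labels.count("convergent") / len(labels)


# Made-up example input: four loops, three of them convergent.
example_pairs = [("+", "-"), ("+", "-"), ("+", "+"), ("+", "-")]
fraction = convergent_fraction(example_pairs)
```

If motif orientation were random, each of the four strand combinations would be equally likely and about a quarter of the pairs would be convergent. The abstract's figure of more than 90% is far above that baseline.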
Fig. 7. Diploid Hi-C maps reveal superdomains and superloops anchored at CTCF-binding repeats on the inactive X chromosome
(A) The frequency of mismatch (maternal-paternal) in SNP allele assignment vs distance between two paired read alignments. Intrachromosomal read pairs are overwhelmingly intramolecular. (B) Preferential interactions between homologs. Left/top is maternal; right/bottom is paternal. The aberrant contact frequency between 6p and 11p (circle) reveals a translocation. (C) Top: In our unphased Hi-C map of GM12878, we observe two loops joining both the promoter of the maternally-expressed H19 and the promoter of the paternally-expressed Igf2 to a distal locus, HIDAD. Using diploid Hi-C maps, we phase these loops: the HIDAD-H19 loop is present only on the maternal homolog (left) and the HIDAD-Igf2 loop is present only on the paternal homolog (right). (D) The inactive (paternal) copy of chromosome X (bottom) is partitioned into two massive “superdomains” not seen in the active (maternal) copy (top). DXZ4 lies at the boundary. (E) The “superloop” between FIRRE and DXZ4 is present in the GM12878 haploid map (top), in the paternal GM12878 map (middle right), and in the map of the female cell line IMR90 (bottom right); it is absent from the maternal GM12878 map (middle left) and the map of the male HUVEC cell line (bottom left). See also Figure S7 and Table S7.


Cited by 1,306 PubMed Central articles