Proc Natl Acad Sci U S A, 112 (47), E6456-65

Chromatin Extrusion Explains Key Features of Loop and Domain Formation in Wild-Type and Engineered Genomes

Adrian L Sanborn et al. Proc Natl Acad Sci U S A.

Abstract

We recently used in situ Hi-C to create kilobase-resolution 3D maps of mammalian genomes. Here, we combine these maps with new Hi-C, microscopy, and genome-editing experiments to study the physical structure of chromatin fibers, domains, and loops. We find that the observed contact domains are inconsistent with the equilibrium state for an ordinary condensed polymer. Combining Hi-C data and novel mathematical theorems, we show that contact domains are also not consistent with a fractal globule. Instead, we use physical simulations to study two models of genome folding. In one, intermonomer attraction during polymer condensation leads to formation of an anisotropic "tension globule." In the other, CCCTC-binding factor (CTCF) and cohesin act together to extrude unknotted loops during interphase. Both models are consistent with the observed contact domains and with the observation that contact domains tend to form inside loops. However, the extrusion model explains a far wider array of observations, such as why loops tend not to overlap and why the CTCF-binding motifs at pairs of loop anchors lie in the convergent orientation. Finally, we perform 13 genome-editing experiments examining the effect of altering CTCF-binding sites on chromatin folding. The convergent rule correctly predicts the affected loops in every case. Moreover, the extrusion model accurately predicts in silico the 3D maps resulting from each experiment using only the location of CTCF-binding sites in the WT. Thus, we show that it is possible to disrupt, restore, and move loops and domains using targeted mutations as small as a single base pair.

Keywords: CRISPR; CTCF; chromatin loops; genome architecture; molecular dynamics.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Chromatin is bendable at the kilobase scale. (A) In situ Hi-C maps DNA-DNA contacts occurring in intact nuclei. Reprinted with permission from ref. . (B) Probability that a restriction fragment will bend to form a cycle as a function of fragment length. Results are shown for four restriction enzymes. The 30-nm fiber predicts a peak around 30 kb (Right, yellow shading), whereas the 10-nm fiber is consistent with the peak observed around 1 kb (Left, yellow shading).
Fig. 2.
Contact domains exhibit a contact probability scaling with ɣ ≈ 0.75. (A, Left) Contact domains from a region on chromosome 4 of GM12878 lymphoblastoid cells. (A, Right) Number of contacts (Top) incident on a 50-kb window at the center of a domain (Bottom). (B) Contact probability vs. genomic distance for 473 individual domains, measured with respect to a 50-kb locus at the domain’s center. A power law (reference slope of −0.75, gray dashed line) is consistently observed inside domains, whose boundary is indicated by a vertical dashed line. A single black line shows contact probability for the domain from A. Domains are grouped by size; each group is vertically shifted by an order of magnitude for visual clarity. (Inset) Contact probability vs. genomic distance, excluding pairs of loci that lie in different contact domains. A power-law (ɣ = 0.76) is seen over two orders of magnitude. (C) Histogram of ɣ values for contact domains across six human cell types. (Inset) Representative microscopy images (maximum Z projections) of four cell types, showing chromatin (blue, DAPI stain) and cytoplasm (red, CellTracker CMTPX dye [Thermo Fisher]). (Scale bar: 10 μm.)
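The scaling exponent ɣ in this caption is the negative slope of contact probability against genomic distance on log-log axes. A minimal sketch of such a fit follows, assuming an ordinary least-squares regression on log-transformed values. The function name `fit_gamma` and the synthetic input are illustrative assumptions, not the authors' pipeline or data.

```python
import math

def fit_gamma(distances, probabilities):
    """Estimate gamma in P(s) ~ s**(-gamma) by least squares on log-log axes.

    Hypothetical helper for illustration; it is not the fitting procedure
    used in the paper.
    """
    xs = [math.log(s) for s in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope  # gamma is the negative of the log-log slope

# Synthetic, noise-free power law (made-up values, not Hi-C data).
distances_bp = [20_000 * 2 ** k for k in range(6)]
contact_prob = [s ** -0.75 for s in distances_bp]
gamma = fit_gamma(distances_bp, contact_prob)
print(round(gamma, 3))
```

On real contact maps the fit would be restricted to distances inside a single domain, as the caption describes, because the power law is reported only up to the domain boundary.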
Fig. 3.
New mathematical theorem demonstrates that chromatin folding inside contact domains is not strictly fractal. (A) Successively applying a simple folding rule transforms a 1D line segment (Left) into a 2D Dragon curve (Right). Our theorem reveals that just as the curve doubles the dimension of the line segment, it doubles the dimension of all subsets of the line segment. Thus, when we intersect the Dragon curve with a line to create a 1D feature, the corresponding points in the original segment must have a Minkowski dimension of 1/2. A corollary of this theorem makes it possible to calculate the contact probability scaling exponent ɣ for any fractal curve. (B) Contact probability vs. distance for various structures. Our corollary predicts that ɣ for fractal curves satisfies ɣ = 2 − (dsurf/d), where dsurf is the dimension of the curve’s surface and d is the dimension of its interior. (Left) Comparing simulations (solid lines) with theoretical predictions (dashed lines). Two-dimensional Hilbert curve (purple; dsurf = 1, d = 2, ɣ = 1.5), 3D Hilbert curve (blue; dsurf = 2, d = 3, ɣ = 1.33), inside-out Hilbert curve, rank 3 (teal; dsurf = 1.5, d = 2, ɣ = 1.25), and fractal globule (green). As the rank of an inside-out Hilbert curve increases, its boundary becomes nearly 2D and its ɣ draws asymptotically close to 1. (Right, Top to Bottom) Three contact domains: chromosome 12: 46.2–46.4 Mb, chromosome 4: 21.8–22.4 Mb, and chromosome 5: 2.1–4.8 Mb.
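The corollary quoted in this caption, ɣ = 2 − (dsurf/d), can be checked against the listed examples with a few lines of arithmetic. The sketch below only evaluates that formula. The function name `gamma_fractal` is an illustrative assumption.

```python
def gamma_fractal(d_surf, d):
    """Contact probability exponent for a fractal curve: gamma = 2 - d_surf / d."""
    if d <= 0:
        raise ValueError("interior dimension d must be positive")
    return 2 - d_surf / d

# (d_surf, d) pairs taken from the caption.
curves = {
    "2D Hilbert curve": (1, 2),
    "3D Hilbert curve": (2, 3),
    "inside-out Hilbert curve, rank 3": (1.5, 2),
}
for name, (d_surf, d) in curves.items():
    print(f"{name}: gamma = {gamma_fractal(d_surf, d):.2f}")
```

The same formula gives ɣ = 1 when dsurf = d = 2, which matches the caption's remark that ɣ approaches 1 as the boundary of an inside-out Hilbert curve becomes nearly 2D.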
Fig. 4.
Value of ɣ obtained using Hi-C is consistent with a tension globule in which loops form by diffusion. (A) Value of ɣ for a polymer after condensation varies as the ratio of internal to external forces, R, is changed. (B) Condensation of a 10-Mb tension globule over time. (C) Contact probability vs. distance for 450 simulated tension globules with a length of 10 Mb. In each case, a power law is seen for distances between 20 kb and 800 kb. Mean ɣ = 0.73, SD = 0.07. (D) Simulation of a region of chromosome 4 in GM12878 (chromosome 4: 20.3–22.6 Mb). The experimental data exhibit four loop domains (Left) and can be recapitulated (Right) using a tension globule containing four loops (black arcs) which is tethered at both ends (SI Appendix). Contact domains form spontaneously.
Fig. 5.
Model based on loop extrusion makes it possible to recapitulate Hi-C maps accurately using only CTCF ChIP-Seq results. (A, i and ii) Extrusion complex loads onto the fiber at a random locus, forming an extremely short-range loop. (A, iii) As the two subunits move in opposite directions along the fiber, the loop grows and the extruded fiber forms a domain. (A, iv) When a subunit detects a motif on the appropriate strand, it can stop sliding. Unlike diffusion, extrusion cannot mediate co-location of motifs on different chromosomes. (B) Three-dimensional rendering of a 3-Mb extrusion globule from the ensemble described below. Convergent CTCF anchors (orange spheres) lead to an unknotted loop spanning a compact, spatially segregated contact domain (highlighted in blue). (C) Contact probability vs. distance for 12 domains with a length of 1 Mb, created in silico using loop extrusion, measured from a 100-kb locus at the center of the domain. In each case, a power law is seen for distances between 5 kb and 400 kb. Mean ɣ = 0.72, SD = 0.06. (D) We use loop extrusion to model a 2.3-Mb region on chromosome 4 of GM12878. CTCF ChIP-Seq signals are normalized and converted into binding probabilities for the simulated extrusion complex. Each peak is assigned a forward (green) or reverse (red) orientation based on the strand of the underlying CTCF motif. Extrusion simulations yield an ensemble of 3D polymer configurations; contact maps for the simulated ensemble (Top) recapitulate the features observed in our kilobase-resolution Hi-C experiments (Bottom), including the position of domains and loops.
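The extrusion steps in panel A can be sketched as a one-dimensional toy model: a complex loads at a random site, its two subunits slide apart, and each subunit may halt at a motif on the appropriate strand. This is a simplified illustration, not the authors' 3D polymer simulation. The lattice representation, the one-site-per-step movement, the halt at fiber ends, the absence of dissociation, and the names `step` and `extrude_once` are all assumptions made for the example.

```python
import random

def step(pos, direction, stop_orientation, motifs, length, rng):
    """Advance one subunit by one lattice site; return (new_pos, stopped).

    motifs maps position -> (orientation, stop_probability), where the
    probability stands in for a normalized ChIP-Seq signal (assumption).
    """
    motif = motifs.get(pos)
    if motif and motif[0] == stop_orientation and rng.random() < motif[1]:
        return pos, True  # bound a motif on the appropriate strand
    nxt = pos + direction
    if nxt < 0 or nxt >= length:
        return pos, True  # reached the end of the simulated fiber
    return nxt, False

def extrude_once(length, motifs, rng):
    """Load one complex at a random locus and extrude until both subunits stop."""
    left = rng.randrange(length - 1)
    right = left + 1  # extremely short-range initial loop
    left_done = right_done = False
    while not (left_done and right_done):
        if not left_done:
            # The subunit moving toward lower coordinates halts at forward motifs.
            left, left_done = step(left, -1, "+", motifs, length, rng)
        if not right_done:
            # The subunit moving toward higher coordinates halts at reverse motifs.
            right, right_done = step(right, +1, "-", motifs, length, rng)
    return left, right

# Toy fiber with one convergent motif pair (positions are made up).
motifs = {20: ("+", 1.0), 80: ("-", 1.0)}
rng = random.Random(0)
anchors = [extrude_once(100, motifs, rng) for _ in range(200)]
print(anchors.count((20, 80)), "of", len(anchors), "loops span the convergent pair")
```

In this toy setup, any complex that loads between the two motifs ends with its anchors on the convergent pair, which is the behavior the caption describes for panel A, iv. Lowering the stop probabilities lets subunits read through a motif, a rough analogue of weak CTCF sites.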
Fig. 6.
Analysis of loop networks reveals many isolated cliques, consistent with a model in which consecutive loops can form simultaneously by extrusion. Cliques of size three (Left), four (Middle), and five (Right), shown as network representations (Above) and in the Hi-C contact map (Below). Nodes correspond to loop anchor loci; edges and open circles indicate a loop called in (8) (solid green), or using a more relaxed threshold (dashed blue). The five-clique exhibits an additional loop (gray) connecting a clique locus to a locus outside the clique. The loop anchor CTCF motifs are indicated; each middle clique locus contains a pair of CTCF motifs in the divergent orientation.
Fig. 7.
Genome editing of CTCF motifs allows reengineering of loops in accordance with the convergent rule; the resulting contact maps can be predicted in silico using extrusion simulations. (A) Results of CRISPR/Cas9-based genome editing experiments at chr8:133.8–134.55 Mb in HAP1 cells. Extrusion simulations (Left) and experimental data (Right) are shown. (A, first row) Contact map for the WT locus, calculated using in silico simulations (Left), closely matches the map observed using Hi-C2 experiments (Right). (A, second row) Deletion of A/Forward eliminates the A-B and A-C loops and the contact domain boundary at locus A. The predictions of our in silico simulations (Left) closely match the contact map observed using Hi-C2 experiments (Right). This and all subsequent simulations of mutant regions use exactly the same parameters as the simulations of the corresponding WT contact map. The only difference in the mutant simulation is the modification of the appropriate CTCF-binding site (in this case, deletion of A/Forward). (A, third row) Deletion of B/Reverse eliminates the A-B loop. (A, fourth row) Deletion of B/Forward eliminates the B-C loop. (A, fifth row) Inversion of B/Forward eliminates the B-C loop. (A, sixth row) Simultaneous deletion of B/Reverse and inversion of B/Forward eliminates the B-C loop. (B) Similar series of results for chromosome 1 (180.3–181.3 Mb). Notably, the elimination of one loop anchor motif at the middle locus fails to eliminate either the D-E or E-F contact domain. When both loop anchor motifs are eliminated, both the D-E and E-F contact domains disappear. (C) We disrupted a forward CTCF motif by inserting a single base at chromosome 5: 31,581,788. Two loops are disrupted. The domain boundary moves to a nearby, weak CTCF site. Because the binding at this new site was weaker than the threshold value, this new boundary was not predicted by our extrusion simulations. (D) Our data suggest that the region shown in A is typically found in one of two states in wild-type cells. In the first state, both the A-B and B-C loop domains are present, but the A-C loop domain is absent. In the second, only the A-C loop domain is present. The data suggest a similar decomposition for the region in B. (E) Extrusion can explain the formation of exclusion domains. In this example, an extrusion complex forms a loop between adjacent motifs in the convergent orientation. Downstream, a second CTCF motif in the reverse orientation is unoccupied. Obstructed on both sides, extrusion complexes landing in the interval between the two reverse motifs tend to remain inside the interval. This leads to the formation of a domain.
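The convergent rule applied in panel A can be written as a small pairing check: a loop is predicted between an upstream forward motif and a downstream reverse motif at a different locus. The sketch below is an illustrative assumption, not the authors' prediction code. It assumes that locus C carries a reverse motif, that B/Reverse lies upstream of B/Forward, and that every convergent pair forms a loop regardless of intervening motifs. The name `predicted_loops` is hypothetical.

```python
def predicted_loops(motifs):
    """Return loops predicted by the convergent rule.

    motifs is a list of (locus, orientation) in genomic order, with "+"
    for forward and "-" for reverse. Any upstream "+" paired with a
    downstream "-" at a different locus counts as a loop (assumption).
    """
    loops = set()
    for i, (left_locus, left_strand) in enumerate(motifs):
        for right_locus, right_strand in motifs[i + 1:]:
            if left_strand == "+" and right_strand == "-" and left_locus != right_locus:
                loops.add((left_locus, right_locus))
    return loops

# Assumed WT arrangement for the region in panel A.
wt = [("A", "+"), ("B", "-"), ("B", "+"), ("C", "-")]

edits = {
    "delete A/Forward": [("B", "-"), ("B", "+"), ("C", "-")],
    "delete B/Reverse": [("A", "+"), ("B", "+"), ("C", "-")],
    "delete B/Forward": [("A", "+"), ("B", "-"), ("C", "-")],
    "invert B/Forward": [("A", "+"), ("B", "-"), ("B", "-"), ("C", "-")],
}
for name, mutant in edits.items():
    lost = predicted_loops(wt) - predicted_loops(mutant)
    print(name, "->", sorted(lost))
```

Under these assumptions, the loops lost in each mutant agree with the rows of panel A: deleting A/Forward removes A-B and A-C, deleting B/Reverse removes A-B, and deleting or inverting B/Forward removes B-C.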
Fig. 8.
We hypothesize that loops are formed during interphase by an extrusion mechanism comprising CTCF and cohesin. Here, we illustrate possible models for the extrusion complex. (A) In one model, the complex includes two DNA-binding subunits, each comprising a cohesin ring and a CTCF protein. When the complex is loaded onto DNA, a tiny loop forms. The two subunits engage the chromatin fiber in an antisymmetrical fashion, with their CTCF proteins facing the outside of the loop, scanning opposite DNA strands. The loop expands without knotting as the subunits slide in opposite directions. The interior of the loop forms a contact domain. When the CTCF proteins find a target motif on the appropriate strand, they can bind, arresting the progress of the subunit. Eventually, the extrusion complex dissociates. (B) In a second model, the sliding of cohesin alone leads to extrusion. Independently, CTCF proteins bind to their motif in an oriented fashion. When the cohesin ring encounters a CTCF protein, the extrusion process either continues or halts, depending on the orientation of CTCF. (C) Detailed view of the model in A. Other models are possible. Notably, it is unclear how many CTCF proteins and cohesin rings participate in a single extrusion complex, or whether the complex is part of a larger structure. All extrusion models predict that focal chromatin interactions mediated by CTCF must be intrachromosomal.
