Science, 326 (5950), 289-93

Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome

Erez Lieberman-Aiden et al. Science.

Abstract

We describe Hi-C, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. We constructed spatial proximity maps of the human genome with Hi-C at a resolution of 1 megabase. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free, polymer conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.

Figures

Fig. 1
Overview of Hi-C. (A) Cells are cross-linked with formaldehyde, resulting in covalent links between spatially adjacent chromatin segments (DNA fragments: dark blue, red; proteins, which can mediate such interactions, are shown in light blue and cyan). Chromatin is digested with a restriction enzyme (here, HindIII; restriction site: dashed line, see inset) and the resulting sticky ends are filled in with nucleotides, one of which is biotinylated (purple dot). Ligation is performed under extremely dilute conditions to create chimeric molecules; the HindIII site is lost and a NheI site is created (inset). DNA is purified and sheared. Biotinylated junctions are isolated with streptavidin beads and identified by paired-end sequencing. (B) Hi-C produces a genome-wide contact matrix. The submatrix shown here corresponds to intrachromosomal interactions on chromosome 14. Each pixel represents all interactions between a 1 Mb locus and another 1 Mb locus; intensity corresponds to the total number of reads (0-50). Tick marks appear every 10 Mb. (C, D) We compared the original experiment to a biological repeat using the same restriction enzyme (C, range: 0-50 reads) and to results with a different restriction enzyme (D, range: 0-100 reads, NcoI).
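The contact matrix in panel (B) can be illustrated with a minimal sketch, assuming the paired-end reads have already been aligned to a single chromosome and reduced to pairs of base-pair coordinates. The function names, the input format, and the toy coordinates are hypothetical and are not taken from the paper.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # 1 Mb bins, matching the resolution quoted in the caption


def bin_contacts(read_pairs, bin_size=BIN_SIZE):
    """Count read pairs per pair of bins on one chromosome.

    read_pairs: iterable of (pos1, pos2) base-pair coordinates (assumed input).
    Returns a Counter keyed by (i, j) with i <= j.
    """
    counts = Counter()
    for pos1, pos2 in read_pairs:
        i, j = sorted((pos1 // bin_size, pos2 // bin_size))
        counts[(i, j)] += 1
    return counts


def to_matrix(counts, n_bins):
    """Expand the sparse counts into a symmetric n_bins x n_bins matrix."""
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for (i, j), count in counts.items():
        matrix[i][j] = count
        matrix[j][i] = count
    return matrix


# Toy read pairs (made up for illustration only).
pairs = [(150_000, 2_300_000), (900_000, 2_900_000), (1_200_000, 1_800_000)]
contact_matrix = to_matrix(bin_contacts(pairs), n_bins=3)
```

Each entry of `contact_matrix` plays the role of one pixel in panel (B): the number of reads linking two 1 Mb loci.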
Fig. 2
The presence and organization of chromosome territories. (A) Probability of contact decreases as a function of genomic distance on chromosome 1, eventually reaching a plateau at ~90 Mb (blue). The level of interchromosomal contact (black dashes) differs for different pairs of chromosomes; loci on chromosome 1 are most likely to interact with loci on chromosome 10 (green dashes) and least likely to interact with loci on chromosome 21 (red dashes). Interchromosomal interactions are depleted relative to intrachromosomal interactions. (B) Observed/expected number of interchromosomal contacts between all pairs of chromosomes. Red indicates enrichment, and blue indicates depletion (up to twofold). Small, gene-rich chromosomes tend to interact more with one another.
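The observed/expected ratio in panel (B) can be sketched as follows. This is a simplified illustration, not necessarily the normalization used in the paper: it assumes the expected count for a chromosome pair is proportional to the product of the two chromosomes' total interchromosomal read counts, rescaled so that the expected counts sum to the observed total. The function name and the toy counts are hypothetical.

```python
def interchromosomal_enrichment(observed):
    """Observed/expected ratio for each chromosome pair.

    observed: dict mapping (chrom_a, chrom_b) -> read-pair count.
    Assumes every chromosome appears in at least one read pair.
    """
    total = sum(observed.values())

    # Total interchromosomal reads touching each chromosome.
    per_chrom = {}
    for (a, b), count in observed.items():
        per_chrom[a] = per_chrom.get(a, 0) + count
        per_chrom[b] = per_chrom.get(b, 0) + count

    # Unscaled expectation: product of the two chromosome totals.
    raw = {pair: per_chrom[pair[0]] * per_chrom[pair[1]] for pair in observed}

    # Rescale so the expected counts sum to the observed total.
    scale = total / sum(raw.values())
    return {pair: observed[pair] / (raw[pair] * scale) for pair in observed}


# Toy counts (made up): ratios above 1 indicate enrichment, below 1 depletion.
toy_counts = {("chr1", "chr2"): 30, ("chr1", "chr3"): 10, ("chr2", "chr3"): 20}
enrichment = interchromosomal_enrichment(toy_counts)
```

Under this normalization, a data set in which all pairs interact uniformly yields a ratio of 1 for every pair.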
Fig. 3
The nucleus is segregated into two compartments corresponding to open and closed chromatin. (A) Map of chromosome 14 at a resolution of 1 Mb (1 tick mark = 10 Mb) exhibits substructure in the form of an intense diagonal and a constellation of large blocks (three experiments combined, range: 0-200 reads). The observed/expected matrix (B) shows loci with either more (red) or fewer (blue) interactions than would be expected given their genomic distance (range: 0.2 – 5). Correlation matrix (C) illustrates the correlation (red: 1, blue: −1) between the intrachromosomal interaction profiles of every pair of 1 Mb loci along chromosome 14. The plaid pattern indicates the presence of two compartments within the chromosome. (D) Interchromosomal correlation map for chromosome 14 and chromosome 20 (red: 0.25, blue: −0.25). The unalignable region around the centromere of chromosome 20 is indicated in grey. Each compartment on chromosome 14 has a counterpart on chromosome 20 with a very similar genome-wide interaction pattern. (E, F) We designed probes for four loci (L1, L2, L3, and L4) that lie consecutively along chromosome 14 but alternate between the two compartments (L1 and L3 in compartment A; L2 and L4 in compartment B). (E) L3 (blue) was consistently closer to L1 (green) than to L2 (red), despite the fact that L2 lies between L1 and L3 in the primary sequence of the genome. This was confirmed visually and by plotting the cumulative distribution. (F) L2 (red) was consistently closer to L4 (green) than to L3 (blue). (G) Correlation map of chromosome 14 at a resolution of 100 kb. The principal component (eigenvector) correlates with the distribution of genes and with features of open chromatin. (H) A 31 Mb window from chromosome 14 is shown; the indicated region (yellow dashes) alternates between the open and closed compartments in GM06990 (top, eigenvector and heatmap), but is predominantly open in K562 (bottom, eigenvector and heatmap). The change in compartmentalization corresponds to a shift in chromatin state (DNase I).
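The chain from panel (A) to the eigenvector in panel (G) can be sketched in a few steps: divide each entry by the mean count at its genomic distance, correlate the resulting rows, and take the leading eigenvector of the correlation matrix. The sketch below is a simplification under stated assumptions: it uses plain power iteration in place of a full principal component analysis, and the toy matrix and all function names are hypothetical.

```python
import math


def observed_over_expected(matrix):
    """Divide each entry by the mean of its diagonal (same genomic distance)."""
    n = len(matrix)
    expected = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    return [
        [matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def correlation_matrix(oe):
    """Correlate the interaction profile (row) of every pair of loci."""
    n = len(oe)
    return [[pearson(oe[i], oe[j]) for j in range(n)] for i in range(n)]


def leading_eigenvector(sym, iterations=200):
    """Power iteration on a symmetric matrix; returns a unit-length vector."""
    n = len(sym)
    vec = [1.0 + 0.01 * i for i in range(n)]  # arbitrary non-uniform start
    for _ in range(iterations):
        nxt = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    return vec


# Toy map (made up): six loci, the first three in one compartment, the last
# three in the other; same-compartment contacts are doubled on top of a
# distance decay.
compartment = [0, 0, 0, 1, 1, 1]
toy = [
    [(100.0 / (abs(i - j) + 1)) * (2 if compartment[i] == compartment[j] else 1)
     for j in range(6)]
    for i in range(6)
]
corr = correlation_matrix(observed_over_expected(toy))
eigenvector = leading_eigenvector(corr)
```

In this toy case the sign of each eigenvector entry separates the two groups of loci. Which sign corresponds to open chromatin is arbitrary and would have to be assigned from external data such as gene density.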
Fig. 4
The local packing of chromatin is consistent with the behavior of a fractal globule. (A) Contact probability as a function of genomic distance, averaged across the genome (blue), shows a power-law scaling between 500 kb and 7 Mb (shaded region) with a slope of −1.08 (fit shown in cyan). (B) Simulation results for contact probability as a function of distance (1 monomer ≈ 6 nucleosomes ≈ 1200 bp, SOM) for equilibrium (red) and fractal (blue) globules. The slope for a fractal globule is very nearly −1 (cyan), confirming our prediction (SOM). The slope for an equilibrium globule is −3/2, matching prior theoretical expectations. The slope for the fractal globule closely resembles the slope we observed in the genome. (C) Top: An unfolded polymer chain, 4000 monomers (4.8 Mb) long. Coloration corresponds to distance from one endpoint, ranging from blue to cyan, green, yellow, orange, and red. Middle: An equilibrium globule. The structure is highly entangled; loci that are nearby along the contour (similar color) need not be nearby in 3D. Bottom: A fractal globule. Nearby loci along the contour tend to be nearby in 3D, leading to monochromatic blocks both on the surface and in cross-section. The structure lacks knots. (D) Genome architecture at three scales. Top: Two compartments, corresponding to open and closed chromatin, spatially partition the genome. Chromosomes (blue, cyan, green) occupy distinct territories. Middle: Individual chromosomes weave back-and-forth between the open and closed chromatin compartments. Bottom: At the scale of single megabases, the chromosome consists of a series of fractal globules.
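The slopes quoted in panels (A) and (B) are exponents of a power law, which appear as straight-line slopes on a log-log plot. A minimal sketch of such a fit, assuming ordinary least squares on log-transformed values, is shown below. The function name and the synthetic curves are hypothetical; the curves are generated from the idealized exponents −1 and −3/2, not from the paper's data.

```python
import math


def fit_power_law_slope(distances, probabilities):
    """Least-squares line through (log10 distance, log10 probability).

    Returns (slope, intercept); the slope is the power-law exponent.
    """
    xs = [math.log10(d) for d in distances]
    ys = [math.log10(p) for p in probabilities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic curves over 0.5, 1, 2, and 4 Mb (inside the 500 kb to 7 Mb window).
distances = [500_000 * 2 ** k for k in range(4)]
fractal_like = [1.0 / s for s in distances]          # idealized exponent -1
equilibrium_like = [s ** -1.5 for s in distances]    # idealized exponent -3/2

fractal_slope, _ = fit_power_law_slope(distances, fractal_like)
equilibrium_slope, _ = fit_power_law_slope(distances, equilibrium_like)
```

Applied to real contact probabilities, the fit would normally be restricted to the distance window where the scaling holds, as the shaded region in panel (A) indicates.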

Cited by 1,998 PubMed Central articles