Nat Struct Mol Biol. 2011 Jan;18(1):107-14.
doi: 10.1038/nsmb.1936. Epub 2010 Dec 5.

The Three-Dimensional Folding of the α-Globin Gene Domain Reveals Formation of Chromatin Globules

Free PMC article

Davide Baù et al. Nat Struct Mol Biol.

Abstract

We developed a general approach that combines chromosome conformation capture carbon copy (5C) with the Integrated Modeling Platform (IMP) to generate high-resolution three-dimensional models of chromatin at the megabase scale. We applied this approach to the ENm008 domain on human chromosome 16, containing the α-globin locus, which is expressed in K562 cells and silenced in lymphoblastoid cells (GM12878). The models accurately reproduce the known looping interactions between the α-globin genes and their distal regulatory elements. Further, using this approach, we find that the domain folds into a single globular conformation in GM12878 cells, whereas two globules are formed in K562 cells. The central cores of these globules are enriched for transcribed genes, whereas nontranscribed chromatin is more peripheral. We propose that globule formation represents a higher-order folding state related to clustering of transcribed genes around shared transcription machineries, as previously observed by microscopy.

Figures

Figure 1
ENCODE region ENm008 on human chromosome 16. (a) Map of ENm008 including the ζ, μ, α2, α1, and θ globin genes. Genes are indicated by grey lines above the linear representation. Vertical black lines indicate HindIII restriction sites. Colored restriction fragments contain annotated genes. Red, orange and green circles mark HS40, other α-globin-related HS sites and CTCF sites, respectively. (b) ENCODE annotations for the ENm008 region. RNA expression data, CTCF data, histone modification data (H3K4me3) and DNase I sensitivity data were generated by the ENCODE project (http://genome.ucsc.edu/ENCODE/).
Figure 2
5C analysis of the 500 Kb ENCODE region ENm008. (a) 5C experimental data for GM12878 cells. The upper plot shows the 5C count matrix, colored yellow to blue to indicate low to high counts. For ease of inspection, the axis labels are replaced by the linear representation of the forward and reverse fragments of the ENm008 region. The lower plots show 5C interaction profiles for fragments containing HS48, HS46, HS40, HBM, HBA2, HBA1, and the 3′ end of LUC7L, respectively. The plots show the 5C counts, with their associated standard errors, for interactions between the anchor fragment (indicated by vertical arrows) and the remaining queried fragments in the ENm008 region; colored bars indicate the positions of HS elements (red), globin genes (green) and the LUC7L gene (blue). Blue solid lines show the expected relationship (average and standard error) between interaction frequency (5C counts) and genomic distance (Kb), determined by LOESS smoothing of the complete dataset (Supplementary Fig. 1). Red circles show the observed 5C counts for each of the queried fragments. (b) 5C experimental data for K562 cells. Data are represented as in panel a.
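The expected count-versus-distance curve mentioned in the Figure 2 caption can be illustrated with a small local-regression sketch. This is not the authors' code: the function name `loess_at`, the window fraction `frac`, and the toy numbers are assumptions made for illustration, and the fit shown is a plain tricube-weighted local linear regression, which is the core step of LOESS.

```python
def loess_at(xs, ys, x0, frac=0.5):
    """Tricube-weighted local linear fit of ys on xs, evaluated at x0.

    xs: genomic distances between fragment pairs (hypothetical units, Kb)
    ys: interaction counts for the same pairs
    frac: fraction of points used in each local window (assumed value)
    """
    k = min(len(xs), max(2, round(frac * len(xs))))
    dists = sorted(abs(x - x0) for x in xs)
    # Bandwidth slightly beyond the k-th nearest point so that point keeps a
    # non-zero weight; fall back to 1.0 if the k nearest points sit at x0.
    h = dists[k - 1] * 1.001 or 1.0
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Toy, made-up data: counts that fall off with genomic distance.
distances_kb = [10, 20, 30, 40, 50, 60, 80, 100]
counts = [95, 70, 52, 41, 33, 28, 20, 15]
expected = [loess_at(distances_kb, counts, d) for d in distances_kb]
```

Observed counts that lie well above the smoothed `expected` value at the same genomic distance would be the candidates for specific looping interactions, which is how the red circles and blue lines in the interaction profiles are meant to be read.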
Figure 3
Ensemble of solutions. (a) Cluster analysis of the 10,000 selected GM12878 models. The upper plot shows the number of models per cluster plotted against the cluster number. Points are colored in proportion to the lowest IMP objective function in the cluster. IMP mirroring is illustrated by the superimposition of the centroids (i.e., the solution closest to the center of the cluster) of clusters one (red) and two (blue). The lower plot shows the structural relationship between the top cluster centroids. The tree was generated based on the structural similarity between the centroids. The branch thickness is proportional to the number of solutions at each branch point. Each centroid, colored as in its linear representation (Fig. 1a), is positioned vertically in proportion to the lowest IMP objective function within the cluster it represents. (b) Cluster analysis of the 10,000 selected K562 models. Data are represented as in panel a. (c) Model consistency for the ensemble of solutions in cluster 1 of GM12878 models (blue) and cluster 2 of K562 models (red).
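The two ideas in the Figure 3 caption, a centroid defined as the solution closest to the center of its cluster and the mirroring of solutions, can be sketched as follows. The similarity measure used by the authors is not given on this page, so the sketch swaps in a distance-matrix RMSD as an assumption. That measure cannot tell a model from its mirror image, which makes explicit why mirrored solutions satisfy distance-derived restraints equally well. All function names are hypothetical.

```python
from math import dist


def distance_matrix(model):
    """All pairwise Euclidean distances between the points of one model."""
    return [[dist(p, q) for q in model] for p in model]


def dm_rmsd(a, b):
    """RMSD between the distance matrices of two models (assumed measure)."""
    da, db = distance_matrix(a), distance_matrix(b)
    n = len(a)
    sq = [(da[i][j] - db[i][j]) ** 2
          for i in range(n) for j in range(i + 1, n)]
    return (sum(sq) / len(sq)) ** 0.5


def centroid_index(models):
    """Index of the model with the smallest total dissimilarity to the rest."""
    totals = [sum(dm_rmsd(m, other) for other in models) for m in models]
    return totals.index(min(totals))


def mirror(model):
    """Reflect a model through the yz-plane."""
    return [(-x, y, z) for x, y, z in model]


# Toy, made-up cluster: one shape at three different scales.
base = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
cluster = [[(s * x, s * y, s * z) for x, y, z in base] for s in (1.0, 1.1, 1.5)]
representative = cluster[centroid_index(cluster)]
```

A measure based on rigid superposition without reflection would separate a model from its mirror image, which is consistent with the caption showing mirrored centroids in two different clusters.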
Figure 4
3D models of the ENm008 ENCODE region containing the α-globin locus. (a) 3D structure of the GM12878 models represented by the centroid of cluster number 1. The 3D model is colored as in its linear representation (Fig. 1a). Regulatory elements are represented as spheres colored in red (HS40), orange (other HSs), and green (CTCFs). (b) 3D structure of the K562 models represented by the centroid of cluster number 2. Data are represented as in panel a. (c) Distances between the α-globin genes (restriction fragments 31–32) and other restriction fragments in ENm008. The plot shows the distribution and standard deviation of the mean of distances for GM12878 models in cluster 1 (blue) and K562 models in cluster 2 (red). (d) Average distances and their standard error between a pair of loci located on either end of the ENm008 domain as determined by FISH with two fosmid probes (see Methods) and from a 2D representation of the IMP-generated models in both cell lines. (e) Example images obtained with FISH of GM12878 and K562 cells. The images show smaller distances between the probes in GM12878 than in K562 cells.
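Panel d compares FISH measurements, which are made in two-dimensional images, with distances taken from a 2D representation of the 3D models. A minimal sketch of one way to obtain such distances follows. The choice of projection (dropping the z coordinate), the function names, and the toy coordinates are assumptions, since the Methods are not reproduced on this page.

```python
from math import hypot, sqrt
from statistics import mean, stdev


def projected_distance(p, q):
    """Distance between two 3D points after projection onto the xy-plane."""
    return hypot(p[0] - q[0], p[1] - q[1])


def mean_and_sem(values):
    """Mean and standard error of the mean (needs at least two values)."""
    return mean(values), stdev(values) / sqrt(len(values))


def ensemble_projected_distance(models, i, j):
    """Mean and SEM of the projected distance between loci i and j."""
    return mean_and_sem([projected_distance(m[i], m[j]) for m in models])


# Toy, made-up ensemble: two models, each holding the two probe loci.
toy_models = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 10.0)],
    [(0.0, 0.0, 0.0), (6.0, 8.0, -2.0)],
]
avg, sem = ensemble_projected_distance(toy_models, 0, 1)
```

Because a projection can only shorten a distance, the projected values are at most as large as the corresponding 3D distances, which is why the comparison with FISH is made in 2D rather than directly on the 3D coordinates.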
Figure 5
Analysis of chromatin globules. (a) Differences in contact frequency maps between models in cluster 1 of GM12878 cells and cluster 2 of K562 cells. Differential expression levels are shown next to the 1D representation of ENm008 along the axes of the plot. (b) Relative abundance of different ENm008 fragment types with respect to the center of their chromatin globules for GM12878 (upper plot) and K562 (lower plot). Plots show the cumulative relative abundance of annotations vs. radial position in the globule. Active genes and promoters are enriched in the center. (c) Observed loops in the centroids of the selected clusters for GM12878 (upper) and K562 (lower) models. The loops are placed over the 1D representation of the ENm008 region. Loop height is proportional to the path length of the loop. Loops are colored in proportion to the distance between the anchor points (dark = near and light = far). Loop sizes in kilobases (Kb) are indicated at the tip of each loop. (d) Chromatin density for the ensemble of solutions in cluster 1 of GM12878 models (blue) and cluster 2 of K562 models (red). DNase I hypersensitive sites are shown next to the 1D representation of ENm008 along the x-axis of the plot.
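Panels a and b rest on two simple quantities that can be computed from an ensemble of models: how often two fragments are in contact, and how far each fragment lies from the center of its globule. The sketch below is illustrative only. The distance cutoff that defines a contact, the use of the unweighted center of mass as the globule center, and all names are assumptions rather than the authors' definitions.

```python
from math import dist


def contact_frequency(models, cutoff):
    """Fraction of models in which each pair of fragments lies within cutoff."""
    n = len(models[0])
    counts = [[0] * n for _ in range(n)]
    for m in models:
        for i in range(n):
            for j in range(n):
                if dist(m[i], m[j]) <= cutoff:
                    counts[i][j] += 1
    return [[c / len(models) for c in row] for row in counts]


def difference_map(freq_a, freq_b):
    """Element-wise difference between two contact frequency maps."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(freq_a, freq_b)]


def radial_distances(model):
    """Distance of each fragment from the model's unweighted center of mass."""
    n = len(model)
    center = tuple(sum(p[k] for p in model) / n for k in range(3))
    return [dist(p, center) for p in model]


# Toy, made-up ensembles of three-fragment models (arbitrary units).
compact = [[(0, 0, 0), (1, 0, 0), (2, 0, 0)]]
extended = [[(0, 0, 0), (5, 0, 0), (10, 0, 0)]]
diff = difference_map(contact_frequency(compact, 1.5),
                      contact_frequency(extended, 1.5))
radii = radial_distances(compact[0])
```

Sorting fragments by their value in `radii` and accumulating annotation counts along that order would give a cumulative abundance curve of the kind described for panel b.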
Figure 6
Diagram of the proposed chromatin globule model for higher-order chromatin folding of actively transcribed genomic regions.
