Genome Res. 2011 Feb;21(2):164-74.
doi: 10.1101/gr.116038.110. Epub 2010 Dec 22.

Chromatin Signatures of the Drosophila Replication Program

Free PMC article

Matthew L Eaton et al. Genome Res. 2011.

Abstract

DNA replication initiates from thousands of start sites throughout the Drosophila genome and must be coordinated with other ongoing nuclear processes such as transcription to ensure genetic and epigenetic inheritance. Considerable progress has been made toward understanding how chromatin modifications regulate the transcription program; in contrast, we know relatively little about the role of the chromatin landscape in defining how start sites of DNA replication are selected and regulated. Here, we describe the Drosophila replication program in the context of the chromatin and transcription landscape for multiple cell lines using data generated by the modENCODE consortium. We find that while the cell lines exhibit similar replication programs, there are numerous cell line-specific differences that correlate with changes in the chromatin architecture. We identify chromatin features that are associated with replication timing, early origin usage, and ORC binding. Primary sequence, activating chromatin marks, and DNA-binding proteins (including chromatin remodelers) contribute in an additive manner to specify ORC-binding sites. We also generate accurate and predictive models from the chromatin data to describe origin usage and strength between cell lines. Multiple activating chromatin modifications contribute to the function and relative strength of replication origins, suggesting that the chromatin environment does not regulate origins of replication as a simple binary switch, but rather acts as a tunable rheostat to regulate replication initiation events.

Figures

Figure 1.
The Drosophila replication program across three cell lines. (A) Replication program in S2-DRSC cells. Genome browser track of whole-genome S-phase replication timing profiles as the log2 ratio of early to late replicating sequences (red), early origin activity as the log2 ratio of BrdU enrichment to input DNA (blue), ORC-binding sites as input-corrected ChIP-seq tag depth (orange), and gene models for a 500-kb region of chromosome 2L. (B) Overlap of early origins in three cell lines. The Venn diagram shows the overlap in total early origin peaks from each cell line. (C) Distribution of early origin meta-peaks per cell line. The percentage of early origin peaks found in three cell lines (light gray), two cell lines (medium gray), or one cell line (dark gray). (D) Overlap of ORC ChIP-seq peaks in three cell lines. As in B, the Venn diagram depicts the overlap in ORC peaks for each cell line. (E) Distribution of ORC meta-peaks per cell line. Same as C for ORC ChIP-seq peaks.
Figure 2.
The chromatin landscape of the replication program. (A) Chromatin correlations with replication timing. The genome-wide replication timing profile of each cell line was paired with the genome-wide array scores for each chromatin factor, and the pairwise correlation of the factor with replication timing was computed (Spearman's ρ). The correlation ρ ranges from −0.5 (blue) to +0.5 (red). (B) The chromatin landscape of early origins. The log2 enrichment for each factor within early origin peaks was determined for each cell line. The enrichment ranges from −2 (blue) to +2 (red). (C) The chromatin landscape of ORC-binding sites. ORC-associated sequences were divided into TSS proximal (overlapping a TSS) and TSS distal (not overlapping a TSS). The log2 enrichment for each factor within 500 bp of the ORC peak centers was determined for each cell line. The enrichment ranges from −2 (blue) to +2 (red). In all panels, gray boxes represent an experiment that has not yet been submitted to modENCODE. See Methods for details.
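The rank correlation used in panel A of Figure 2 can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`average_ranks`, `pearson`, `spearman_rho`) and the five-window toy values are hypothetical, chosen only to show how Spearman's ρ pairs a replication timing profile with a chromatin factor's scores over the same genomic windows.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: one value per genomic window (not from the paper).
timing = [1.2, 0.8, -0.3, -1.1, 0.1]   # log2(early/late) replication timing
factor = [2.0, 0.4, 0.2, -0.9, 1.5]    # array score for one chromatin factor
rho = spearman_rho(timing, factor)
print(f"Spearman rho = {rho:.2f}")
```

Because the statistic depends only on ranks, it is insensitive to the differing scales of timing ratios and array scores, which is one reason a rank correlation suits this kind of pairing.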
Figure 3.
Sequence, chromatin, and DNA-binding proteins classify ORC-binding sites. (A) SVM performance was gauged by the ROC curve resulting from separately using sequence features, chromatin mark features, binding protein features, or a combination of all three. In each case, the SVM was trained using 10-fold cross validation on three chromosome arms (2L, 3L, and 3R) and tested on a fourth chromosome arm (2R). (B) The importance of individual features was determined by plotting the F-score as a function of class proximity represented here as a t-statistic.
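The evaluation scheme in panel A of Figure 3 (train on arms 2L, 3L, and 3R; test on 2R; summarize with a ROC curve) can be sketched as follows. The sketch does not implement the SVM itself. It assumes classifier scores already exist, and the record layout, the `split_by_arm` and `roc_auc` helpers, and all values are hypothetical.

```python
def split_by_arm(sites, test_arm="2R"):
    """Hold out every site on one chromosome arm; train on the rest."""
    train = [s for s in sites if s["arm"] != test_arm]
    test = [s for s in sites if s["arm"] == test_arm]
    return train, test


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive outscores a randomly chosen negative (ties count 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical records: label 1 = ORC-bound site, 0 = unbound control.
# "score" stands in for the decision value of some already-trained classifier.
sites = [
    {"arm": "2L", "label": 1, "score": 0.8},
    {"arm": "3L", "label": 0, "score": 0.2},
    {"arm": "3R", "label": 1, "score": 0.7},
    {"arm": "2R", "label": 1, "score": 0.9},
    {"arm": "2R", "label": 1, "score": 0.4},
    {"arm": "2R", "label": 0, "score": 0.5},
    {"arm": "2R", "label": 0, "score": 0.1},
]

train_sites, test_sites = split_by_arm(sites)
held_out_auc = roc_auc(
    [s["label"] for s in test_sites],
    [s["score"] for s in test_sites],
)
print(f"AUC on held-out arm: {held_out_auc:.2f}")
```

Holding out a whole chromosome arm, rather than a random subset of sites, keeps neighboring and possibly correlated sites from appearing on both sides of the split.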
Figure 4.
Chromatin signatures are predictive of early origin activity. (A) Subsets of factors are highly correlated at early origins. (Left heatmap) The pairwise correlation between every factor based on their mean signal at early origin meta-peaks in Bg3 cells was computed (where green indicates a negative correlation and red indicates a positive correlation, with values ranging from −0.76 to 1). Five groups of correlated marks were identified by hierarchical clustering. (Right heatmap) The mean enrichment of each cluster in Bg3 active (+) and Bg3 inactive (−) early origin meta-peaks. (B) Classification of Bg3 early origin usage from the full set of early origin meta-peaks by logistic regression. A logistic regression model using the average chromatin scores of each of the five clusters in Bg3 cells is able to classify (above and below the 0.5 horizontal dashed line as true and false, respectively) with 78% accuracy those meta-peaks that are used in Bg3 on chromosome 3R (blue) and those that are not (gray). (Inset) Predictive power for each cluster individually and the ensemble model. (C) Predicting relative origin strength between Bg3 and S2 cells by linear regression. A linear regression using the change in strength of the chromatin signal from five clusters between Bg3 and S2 is able to predict the change in strength of the early origin meta-peaks between the two cell lines. Predicted change in early origin strength between Bg3 and S2 is plotted as a function of actual change. Early origins active in S2 (red) or Bg3 (blue) are indicated. The Pearson correlation is ∼0.7. (Inset) The RMSE over random (horizontal dashed line) for each cluster individually and the ensemble model.
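The classification step in panel B of Figure 4 can be illustrated with a minimal logistic regression sketch: five cluster-level chromatin scores per meta-peak go in, a probability comes out, and 0.5 is the decision threshold. Everything here is an assumption for illustration, including the plain gradient-descent fitting, the function names, the learning rate, and the toy scores; none of it reproduces the published model or its 78% accuracy.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Predicted probability that each meta-peak is an active origin."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def accuracy(y, probs, threshold=0.5):
    """Fraction of meta-peaks whose thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == yi for p, yi in zip(probs, y))
    return hits / len(y)


# Hypothetical mean scores for five chromatin clusters per meta-peak.
X_train = [
    [1.0, 0.1, -0.2, 0.0, 0.1],    # active in the cell line
    [1.2, -0.1, 0.1, 0.2, 0.0],    # active
    [0.8, 0.0, 0.0, -0.1, -0.1],   # active
    [-1.0, 0.1, 0.0, 0.1, -0.1],   # inactive
    [-0.9, -0.2, 0.1, 0.0, 0.1],   # inactive
    [-1.1, 0.0, -0.1, 0.1, 0.0],   # inactive
]
y_train = [1, 1, 1, 0, 0, 0]

w, b = fit_logistic(X_train, y_train)
train_accuracy = accuracy(y_train, predict_proba(X_train, w, b))
print(f"training accuracy = {train_accuracy:.2f}")
```

In practice the accuracy would be measured on meta-peaks held out from fitting, as the caption does for chromosome 3R, rather than on the training set used in this toy example.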
