Cell Rep. 2015 May 5;11(5):821-34.
doi: 10.1016/j.celrep.2015.03.070. Epub 2015 Apr 23.

High-resolution Profiling of Drosophila Replication Start Sites Reveals a DNA Shape and Chromatin Signature of Metazoan Origins

Free PMC article

Federico Comoglio et al. Cell Rep.

Abstract

At every cell cycle, faithful inheritance of metazoan genomes requires the concerted activation of thousands of DNA replication origins. However, the genetic and chromatin features defining metazoan replication start sites remain largely unknown. Here, we delineate the origin repertoire of the Drosophila genome at high resolution. We address the role of origin-proximal G-quadruplexes and suggest that they transiently stall replication forks in vivo. We dissect the chromatin configuration of replication origins and identify a rich spatial organization of chromatin features at initiation sites. DNA shape and chromatin configurations, not strict sequence motifs, mark and predict origins in higher eukaryotes. We further examine the link between transcription and origin firing and reveal that modulation of origin activity across cell types is intimately linked to cell-type-specific transcriptional programs. Our study unravels conserved origin features and provides unique insights into the relationship among DNA topology, chromatin, transcription, and replication initiation across metazoa.

Figures

Figure 1. High-resolution mapping of the Drosophila origin repertoire
(A) Percentage overlap of origin peaks identified in S2, Bg3 and Kc (Cayrou et al., 2011) Drosophila cells and comparison of observed pairwise overlaps (lines) with random expectations (boxplots). n, total number of origin peaks. (B) Percentage overlap of S2 and Bg3 origin peaks and modENCODE early origin regions (EOR). (C) Origin score of EOR overlapping with S2 and Bg3 origins (common) or solely identified by modENCODE (specific). (D) Efficiency of S2 and Bg3 origins partitioned and color-coded according to (A). Background estimates are shown. (E) A representative snapshot of the SNS-Seq coverage in S2 and Bg3 cells from two biological replicates and detected origin peaks. A single EOR (green) spans most of this 175 kb genomic region. Kc (gray) and constitutive (black) origins are also shown. p-values are from Wilcoxon rank-sum test. See also Figure S1.
Figure 2. A G-quadruplex signature at S2 replication origins
(A) Spatial distribution of G4 motifs within ± 2 kb of S2 RSSs. (B) Same as (A) for strand-specific annotation of G4 L1-15 motifs. Arrows indicate peak distances (bp) from the RSS. (C) S2 SNS-Seq signals within ± 2.5 kb of origin-associated G4 L1-15 motifs occurring on the plus (left) and minus (right) strands, ranked by coefficient of variation. Bottom panels show the average of the signals above (Ori+) and at origin-negative (Ori−) G4 motifs. Arrows indicate the direction of the leading strand facing the G4. (D) Model describing how origin-proximal G4 motifs could orient (black arrows) replication forks. Leading strands (long arrows) and Okazaki fragments (short) replicating the plus (red) and minus (blue) strands are indicated. See also Figure S2.
Figure 3. Origin-proximal G-quadruplexes stall replication forks in vivo
(A) An outline of the experimental strategy used to indirectly monitor replication fork progression at origin-associated G4. Fractions 4-6, corresponding to marker lanes 4-6, were individually purified and subjected to two sequential rounds of T4 PNK phosphorylation and λ-exo digestion. Two sequencing libraries were prepared for each sample and origin peaks were called on their union. (B) A representative snapshot of the single-fraction SNS-Seq coverage. Origin peaks identified in each fraction and S2 origin peaks from standard SNS-Seq experiments are shown. (C) Average single-fraction SNS-Seq signal within ± 2.5 kb of origin-associated G4 L1-15 motifs occurring on the plus (left) and minus (right) strands. (D) Two representative G4 motifs occurring on opposite strands are shown. (E) Model describing how origin-proximal G4 motifs could act as replication fork barriers. Origin-proximal G4 pause the synthesis of the nascent leading strands replicating the G4 template. See also Figure S3.
Figure 4. Specific DNA shape features mark metazoan replication origins
(A) Relative frequency of AAAA polynucleotides and AA dinucleotides within ± 250 bp of S2 RSSs. (B) Same as (A) for AT and GC dinucleotides. (C-F) Average of DNA shape features within ± 1 kb of RSSs for constitutive Drosophila origins, background regions and TSSs. The latter were extended while preserving orientation. Solid lines are Loess fitted curves from single-nucleotide resolution shape predictions (dots). Boxplots of average feature values within 500 bp windows (thick black lines) are shown (bottom panels). p-values are from Wilcoxon rank-sum test. See also Figure S4.
Figure 5. The chromatin composition of Drosophila replication origins
(A) Percentage overlap of S2 origin peaks with DHSs and random expectation. (B) Efficiency of S2 origins localizing within (+) or outside (−) DHSs. (C) Average DNase-Seq enrichment within ± 2.5 kb of S2 RSSs and within ten sets of randomized genomic regions (Rand). The thick gray line traces average background values. (D) Spatial distribution of SNS-Seq signal within ± 2.5 kb of S2 RSSs (top), metaprofiles comparing origins with ten sets of randomized genomic regions (middle), and further partitioning of the signal above in four timing classes (L: late S-phase; M: mid; E: early) based on replication timing quartiles (bottom). (E) Same as (D) for MNase-Seq. (F-G) Same as (C) for the indicated features. p-values are from Wilcoxon rank-sum test. See also Figure S5.
Figure 6. Origin activity of CG-rich regions is predicted by chromatin landscape and transcriptional output
(A) Percentage overlap of origin peaks associated with CGRs in S2, Bg3 and Kc (Cayrou et al., 2011) cells, and comparison of the observed overlap between S2 and Bg3 origin peaks and CGRs (lines) with random expectation (boxplots). n, total number of CGRs. (B) Two representative snapshots of the S2 SNS-Seq coverage from two biological replicates, origin peaks and poly(A)+ RNA-Seq coverage across several CGRs. Constitutive origins (black) are also shown. (C) Spatial distribution of SNS-Seq signal within ± 2.5 kb of S2 origin-CGR midpoints (top) and metaprofiles (bottom) comparing origin-CGRs (Ori+) with origin-negative CGRs (Ori−). (D) Same as (C) for MNase-Seq. (E) Same as bottom panel of (C) for the indicated features. (F) An outline of the modeling strategy used to classify CGRs. (G) ROC curves and AUC values for lasso models trained on the indicated sets of features. The inset shows selection probabilities of the top-ranked features selected by bootstrap-lasso. Bars are color-coded according to coefficient signs (positive, red; negative, blue) and absolute value of coefficient z-scores. (H) S2/Bg3 RNA-Seq fold change for the indicated classes of origin-CGRs. p-values are from Wilcoxon rank-sum test. See also Figure S6.
Figure 7. Differential origin activity mirrors differences in cell-type-specific transcriptional programs
(A) Spatial distribution of S2 SNS-Seq signal within ± 2.5 kb of RSSs of origin peaks solely identified in Bg3 cells (top) and metaprofiles (bottom) comparing all S2 origin peaks with these sites. (B) Same as (A) for MNase-Seq. (C-D) Same as bottom panel of (A) for the indicated features. (E) Scatter plot of S2 and Bg3 RNA-Seq signals at differentially activated origins (DAOs) that were more efficiently used by S2 (DAO+) or Bg3 (DAO−) cells. Triangles, constitutive origins; circles, origin peaks solely detected in one cell type. Opacity reflects the statistical significance of differential origin activity and is proportional to -log10-transformed adjusted p-values. (F) S2/Bg3 RNA-Seq fold change of equally activated origins (unchanged) and of differentially activated ones. p-values are from Wilcoxon rank-sum test. (G) ROC curves and AUC values for origin-classifiers trained on the indicated set of features in Drosophila. The inset shows selection probabilities of the top-ranked features selected by bootstrap-lasso and used to train the simplified model. Bars are color-coded according to coefficient signs (positive, red; negative, blue) and absolute values of coefficient z-scores. (H) Same as (G) for constitutive and HeLa-specific human origins. See also Figure S7.
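
Several panels above (Figures 1A, 5A, and 6A) compare an observed percentage overlap between origin peaks and another set of regions with a random expectation. The sketch below illustrates one generic way such a comparison can be set up; it is not the authors' pipeline. It assumes a single toy chromosome, half-open `(start, end)` intervals, and uniform repositioning of peaks as the null model. The function names, the toy coordinates, and the `genome_length` value are all hypothetical.

```python
import random


def overlap_fraction(peaks, regions):
    """Fraction of peaks that overlap at least one region (half-open intervals)."""
    if not peaks:
        return 0.0
    hits = sum(
        any(p_start < r_end and r_start < p_end for r_start, r_end in regions)
        for p_start, p_end in peaks
    )
    return hits / len(peaks)


def shuffle_peaks(peaks, genome_length, rng):
    """Move each peak to a uniformly random position, preserving its width."""
    shuffled = []
    for start, end in peaks:
        width = end - start
        new_start = rng.randrange(0, genome_length - width + 1)
        shuffled.append((new_start, new_start + width))
    return shuffled


def random_expectation(peaks, regions, genome_length, n_iter=1000, seed=0):
    """Overlap fractions for n_iter randomized peak sets (assumed null model)."""
    rng = random.Random(seed)
    return [
        overlap_fraction(shuffle_peaks(peaks, genome_length, rng), regions)
        for _ in range(n_iter)
    ]


# Toy data (hypothetical coordinates, not taken from the study).
peaks = [(100, 200), (500, 600), (900, 1000)]
regions = [(150, 250), (550, 650)]

observed = overlap_fraction(peaks, regions)
null = random_expectation(peaks, regions, genome_length=10_000, n_iter=200)

# Empirical one-sided p-value with a +1 pseudocount.
p_value = (1 + sum(x >= observed for x in null)) / (1 + len(null))
print(f"observed overlap: {observed:.2f}, empirical p: {p_value:.3f}")
```

In a real analysis the randomization would usually respect chromosome boundaries and mappability, and the distribution of the randomized overlaps would be what a boxplot summarizes.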
