Nat Commun. 2019 Apr 8;10(1):1598.
doi: 10.1038/s41467-019-09551-w.

Multiplexed Cas9 targeting reveals genomic location effects and gRNA-based staggered breaks influencing mutation efficiency

Santiago Gisler et al. Nat Commun.

Abstract

Understanding the impact of guide RNA (gRNA) and genomic locus on CRISPR-Cas9 activity is crucial to design effective gene editing assays. However, it is challenging to profile Cas9 activity in the endogenous cellular environment. Here we leverage our TRIP technology to integrate ~ 1k barcoded reporter genes in the genomes of mouse embryonic stem cells. We target the integrated reporters (IRs) using RNA-guided Cas9 and characterize induced mutations by sequencing. We report that gRNA-sequence and IR locus explain most variation in mutation efficiency. Predominant insertions of a gRNA-specific nucleotide are consistent with template-dependent repair of staggered DNA ends with 1-bp 5' overhangs. We confirm that such staggered ends are induced by Cas9 in mouse pre-B cells. To explain observed insertions, we propose a model generating primarily blunt and occasionally staggered DNA ends. Mutation patterns indicate that gRNA-sequence controls the fraction of staggered ends, which could be used to optimize Cas9-based insertion efficiency.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Overview of CRISPR-Cas9 assays in TRIP cell line and pools. a Barcoded TRIP reporter construct. b Clonal PGK-driven TRIP cell line with 36 IRs (left), and TRIP pool containing ~ 1k IRs with various promoters (right) - CMV, cMyc, Hoxb1, Nanog, Oct4, p53, PGK. Genomic location and expression of IRs were determined by DNA and RNA sequencing prior to Cas9 targeting of IR regions using different guides. Targeted DNA sequencing of IR regions was further used to characterize mutations arising from repair of Cas9-induced DSBs. c Cas9-guide RNA combinations used in independent assays. TRIP cell line was targeted using Cas9 complexes with sgRNA1, sgRNA2 or sgRNA3 (left). In TRIP pool assays, Cas9 was complexed with sgRNA2 or sgRNA3 (right). Knock-in of a single-stranded oligodeoxynucleotide (ssODN) was performed with sgRNA2
Fig. 2
Contribution of IR locus, guide RNA, promoter and ssODN to Cas9-induced mutation frequency. a Frequency per outcome in cell line Cas9 assays, showing effects of IR locus and guide RNA. Each bar represents one of 36 IRs in the cell line, and each colored band denotes the fraction of reads exhibiting a particular outcome among all reads mapped to that IR (vertical axis). Outcomes: wild-type in light blue, deletion in red, insertion in dark blue, and a complex mix of mutations in beige. b Frequency per outcome in TRIP pool assays, for guide RNA and ssODN inclusion combinations. Dots denote frequency (vertical axis) per outcome (color) for 1359 IRs with at least 30 reads in all assays. Boxes show the median, first and third quartiles of the frequency distributions; whiskers extend to 1.5 times the inter-quartile range from the top and bottom of the box. c Frequency per outcome in TRIP pools, stratified by promoter. Each bar denotes the subset of IRs associated with a given promoter; colored bands denote median frequency per outcome. d Correlation of IR mutation frequency across TRIP pool assays. Each dot indicates mutation frequency of a given IR in two different experiments (horizontal and vertical axes). Linear regression lines and corresponding R² values denote correlations. e Ratio between knock-in and error-based insertions (vertical axis) with respect to binned IR mutation frequency (horizontal axis). Grey dots indicate ratios for individual IRs, black dots denote mean ratios within bins, blue ribbon shows the 95% confidence interval around the mean. f Goodness-of-fit of linear regression model predicting mutation frequency based on IR locus, guide RNA, ssODN, promoter, and (locus, guide) interaction term. g Effect size or variance explained by variables in the regression model. Plotted are eta squared values for multi-way ANOVA tests based on type II sum of squares. Source data are provided in the Source Data file
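The eta squared values in panel g express each model term's sum of squares as a share of the total. The sketch below is only an illustration of that ratio, not the authors' analysis code. It assumes the denominator is the sum of all rows of the ANOVA table, including the residual, and the sums of squares in the example are made-up numbers, not values from the paper.

```python
def eta_squared(sum_sq):
    """Return eta squared per model term.

    sum_sq maps a term name to its sum of squares and must contain a
    "Residual" entry. Assumption: the denominator is the sum of every
    row in the table (all terms plus the residual).
    """
    if "Residual" not in sum_sq:
        raise ValueError('sum_sq must include a "Residual" entry')
    total = sum(sum_sq.values())
    if total <= 0:
        raise ValueError("total sum of squares must be positive")
    # Report effect sizes for model terms only, not for the residual.
    return {term: ss / total for term, ss in sum_sq.items() if term != "Residual"}


# Hypothetical sums of squares, for illustration only.
example = {
    "locus": 40.0,
    "guide": 30.0,
    "ssODN": 5.0,
    "promoter": 5.0,
    "locus:guide": 10.0,
    "Residual": 10.0,
}
effects = eta_squared(example)
```

With these invented inputs the total is 100, so each term's eta squared is simply its sum of squares divided by 100.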
Fig. 3
Correlation of TGE features with Cas9-induced IR mutation frequency in the TRIP pool. a Genomic location of the 1359 IRs with at least 30 mapped reads in all TRIP pool Cas9 assays. Each tick denotes the location of an IR on the chromosome, colored according to the associated promoter. b Correlation of TGE features with IR mutation frequency per guide RNA. Boxplots show the distribution of absolute Pearson’s correlations between deletion (red) or insertion (blue) frequency and each of 82 distinct TGE features across IRs. Boxes show the median, first and third quartiles of the frequency distributions; whiskers extend to 1.5 times the inter-quartile range from the top and bottom of the box. c Correlation of IR mutation frequency with TGE features stratified per category. Boxplots show the distribution of absolute Pearson’s correlations between deletion or insertion frequency and each of 82 TGE features stratified into six categories (color-coded according to legend). d Correlation between IR expression or IR mutation frequency and TGE features. Heatmap shows the Pearson’s correlation between IR expression or IR mutation frequency (deletion or insertion) in the different TRIP pool assays (rows), and individual TGE features from a subset of 62 (columns), including all except transcriptional regulators without known epigenetic activity. Cells are gradient-colored based on correlation values, and color intensity denotes significance of adjusted p-value. Colored circles at the top indicate TGE feature categories. Source data are provided in the Source Data file
Fig. 4
Mutation patterns induced by Cas9 in the 36-integration TRIP cell line. a Observed deletion and insertion sizes. Heatmaps show the overall frequency (color gradient) of deletions (red, left) and insertions (blue, right) per size (horizontal axis) for each guide RNA (vertical axis) in the TRIP cell line. b Deletion patterns and positions. Shown for each guide RNA are the ten most frequent deletion patterns with respect to the non-target DNA, from top to bottom in decreasing order of frequency. Each horizontal bar indicates the position of a deletion pattern, and corresponding non-target DNA sequence lost (at the bottom), colored according to frequency. Expected 3|4 and alternative 4|5 break sites are indicated by two vertical dashed lines. c Frequency of sites neighboring the ten most frequently deleted regions for each guide RNA, shown in Fig. 4b. Three vertical bars indicate the proportion of: all such deletions regardless of neighboring site (all, red), the subset of those deletions neighboring the expected break site (3|4, green), or the subset of those deletions neighboring the alternative break site (4|5, orange). For deletions with ambiguous positions, we weighted the frequencies by the ratio of positions meeting the criteria. We observed similar trends using all data. d Frequency of each nucleotide in 1-bp insertions. For each guide RNA, boxplots show the frequency (vertical axis) of insertions of each nucleotide (horizontal axis and color) across the 36 IRs (dots). Boxes show the median, first and third quartiles of the frequency distributions; whiskers extend to 1.5 times the inter-quartile range from the top and bottom of the box. Source data are provided in the Source Data file
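The weighting described in panel c for deletions with ambiguous positions can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Each deletion is represented by its frequency and a list of equally plausible placements; a placement is a hypothetical pair of junction indices (left, right), where junction n lies between nucleotides n and n+1 upstream of the PAM, so 3|4 is junction 3 and 4|5 is junction 4. "Neighboring" is assumed to mean that one boundary of the deletion coincides with the break site.

```python
def neighbors(placement, site):
    """True if either boundary of the deletion sits exactly on the break site."""
    left, right = placement
    return left == site or right == site


def weighted_frequency(deletions, site):
    """Sum deletion frequencies, weighting each by the fraction of its
    candidate placements that neighbor the given break site."""
    total = 0.0
    for freq, placements in deletions:
        if not placements:
            continue  # nothing to weight for a deletion with no placement
        hits = sum(1 for p in placements if neighbors(p, site))
        total += freq * hits / len(placements)
    return total


# Hypothetical data: (frequency, candidate placements as junction pairs).
deletions = [
    (0.20, [(3, 5)]),          # unambiguous, touches junction 3
    (0.10, [(3, 4), (4, 5)]),  # ambiguous 1-bp deletion with two placements
    (0.05, [(6, 8)]),          # unambiguous, away from both break sites
]
at_3_4 = weighted_frequency(deletions, 3)
at_4_5 = weighted_frequency(deletions, 4)
```

In this invented example the ambiguous deletion contributes half of its frequency to the 3|4 site, because only one of its two placements has a boundary at junction 3, and its full frequency to the 4|5 site, because both placements touch junction 4.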
Fig. 5
One-nucleotide insertion patterns and DNA end structures at the break site. a Illustration of blunt and staggered DNA ends at the break site, and expected distribution of 1-bp insertions of the four nucleotides following DNA repair. Double-stranded sequences including PAM and 8-bp upstream, with bottom and top denoting target and non-target DNA. Blue straight and staggered lines through the sequences indicate blunt and staggered DNA ends. Colored bars on top sketch the expected distribution of 1-bp insertions upon DNA repair. Blunt model: blunt-ends primarily at 3|4 upstream of the PAM (straight line), resulting in template-independent insertion and thus similar frequencies of the four nucleotides (uniform distribution, similar-height colored bars). Staggered model: staggered ends mostly with termini at 3|4 (tDNA) and 4|5 (ntDNA) upstream of the PAM (staggered line), with template-dependent fill-in resulting in a skewed distribution with most insertions of the DNA base identical to nucleotide 4 (unequal-height colored bars). b Unambiguous insertion counts (filled bars) and ambiguous insertion counts (empty bars) redistributed according to blunt, staggered, and combined models. Shown are insertion counts (vertical axis) of each nucleotide (color) per site on the ntDNA (horizontal axis). Vertical shaded areas indicate the 3|4 and 4|5 sites upstream of the PAM. Unambiguous counts are directly determined from the data (filled bars), whereas ambiguous counts are redistributed over windows of ambiguous sites (empty bars) based on: (i) relative proportions of unambiguous counts, and (ii) likelihood of each nucleotide insertion according to the cleavage model. c Re-analysis of DNA ends generated by Cas9 targeting of a region on chromosome 6 in mouse pre-B cells deficient in DNA Ligase IV and arrested in G1 phase. Bar length denotes relative frequency, shown for the ten most frequent DNA end structures accounting for ~ 91% of all unique patterns in the data. Absolute frequencies are displayed. Multiple DNA end structures associated with the same sequence are grouped with a single bar and label. Bars are colored by type of structure. The bottom left figure shows an illustration of two DNA structures: blunt (3|4t, 3|4nt), and 5' 1-bp overhang (3|4t, 4|5nt). Source data are provided in the Source Data file
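The redistribution of ambiguous insertion counts in panel b combines two ingredients: the relative proportions of unambiguous counts and a model-based likelihood. The caption does not give the exact formula, so the sketch below assumes the simplest combination, a weight per site equal to the product of the two, normalized over the window. The function name, the site labels, and all numbers are hypothetical.

```python
def redistribute(ambiguous_count, window, unambiguous, likelihood):
    """Split an ambiguous insertion count across the sites in a window.

    Assumed weighting (not confirmed by the source): weight(site) =
    unambiguous count at the site * model likelihood at the site,
    normalized so the weights over the window sum to 1.
    """
    if not window:
        raise ValueError("window must contain at least one site")
    weights = [unambiguous.get(s, 0) * likelihood.get(s, 0.0) for s in window]
    total = sum(weights)
    if total == 0:
        # No information from either source: fall back to an even split.
        return {s: ambiguous_count / len(window) for s in window}
    return {s: ambiguous_count * w / total for s, w in zip(window, weights)}


# Hypothetical inputs for one nucleotide and one window of two sites.
window = ["3|4", "4|5"]
unambiguous = {"3|4": 30, "4|5": 10}    # observed unambiguous counts
likelihood = {"3|4": 0.2, "4|5": 0.9}   # made-up cleavage-model likelihoods
shares = redistribute(40, window, unambiguous, likelihood)
```

With these invented inputs the raw weights are 6 and 9, so the 40 ambiguous insertions split into 16 at 3|4 and 24 at 4|5. The total count is preserved, which is the property any such redistribution scheme should keep.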
Fig. 6
Illustration of DNA repair outcomes after Cas9-induced double-strand break. Both blunt and staggered ends can be directly ligated back into wild-type sequence, or generate a deletion through resection by nuclease activity prior to ligation. Blunt ends can also result in an insertion by template-independent addition of a random nucleotide, possibly established by Pol µ. Staggered ends lead primarily to template-dependent insertions, possibly established by polymerases such as Pol µ or Pol λ
Fig. 7
Potential applications of CRISPR-on-TRIP. RNA-guided Cas9 targeting of regions within integrated TRIP reporters (CRISPR-on-TRIP) can be combined with other assays to investigate effects of various processes on Cas9-induced mutation frequency and patterns
