Int J Biol Sci. 2019 Oct 3;15(12):2641-2653.
doi: 10.7150/ijbs.37152. eCollection 2019.

Evaluation of the effects of sequence length and microsatellite instability on single-guide RNA activity and specificity

Changzhi Zhao et al. Int J Biol Sci. 2019.

Abstract

Clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 technology is effective for genome editing and is now widely used in life science research. However, the key factors determining the editing efficiency and off-target cleavage activity of single-guide RNA (sgRNA) are poorly documented. Here, we systematically evaluated the effects of sgRNA length on genome editing efficiency and specificity. Results showed that sgRNA 5'-end length can alter genome editing activity. Although the number of predicted off-target sites increased significantly after sgRNA truncation, sgRNAs of different lengths were highly specific, because only a few predicted off-targets had detectable cleavage activity as determined by Target capture sequencing (TargetSeq). Interestingly, more than 20% of the predicted off-targets for selected sgRNAs targeting the dystrophin gene contained microsatellites, which can produce genomic instability and interfere with accurate assessment of off-target cleavage activity. We found that sgRNA activity and specificity can be sensitively detected by TargetSeq in combination with in silico prediction. Checking whether the on- and off-targets contain microsatellites is necessary to improve the accuracy of analyzing genome editing efficiency. Our research provides new features and novel strategies for the accurate assessment of CRISPR sgRNA activity and specificity.

Keywords: CRISPR/Cas9; activity; length; microsatellite; specificity.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interest exists.

Figures

Figure 1
Differences in the activity of sgRNAs of different lengths at the same genomic location. (A) Scheme for using 20, 19, 18, and 17 nt sgRNAs on target genes. The recognized sequence patterns are N20NGG, N19NGG, N18NGG, and N17NGG. (B) Activities of sgRNAs of different lengths on target genes, measured by T7ENI cleavage assay. “NGG” represents the protospacer adjacent motif (PAM) sequence; N represents any one of the four bases: adenine (A), guanine (G), cytosine (C), or thymine (T); sgR: single-guide RNA; bp: base pairs; DL2000: DNA marker; control: wild-type control cells.
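The N20NGG to N17NGG patterns in panel (A) can be illustrated with a short Python sketch. It scans one strand for a spacer of a chosen length followed by an NGG PAM. The function names are hypothetical, and the sketch assumes that shorter guides are obtained by dropping bases from the 5' end (the PAM-distal end), as the abstract's reference to 5'-end length suggests. It is not the authors' design pipeline.

```python
import re


def find_protospacers(seq, spacer_len):
    """Return (start, spacer, pam) for every N{spacer_len}NGG site on this strand.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def truncate_5prime(spacer, length):
    """Shorten a spacer from its 5' end, keeping the PAM-proximal bases.

    Assumption: truncated guides keep the 3' (PAM-proximal) end unchanged.
    """
    if not 0 < length <= len(spacer):
        raise ValueError("length must be between 1 and len(spacer)")
    return spacer[-length:]


# Toy sequence (made up for illustration): one 20 nt spacer followed by an AGG PAM.
toy = "TT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TT"
for n in (20, 19, 18, 17):
    print(n, find_protospacers(toy, n))
```

Because the PAM is fixed, the 20, 19, 18, and 17 nt patterns all end at the same genomic position, and only the 5' start of the spacer moves.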
Figure 2
Predicted specificity of sgRNAs of different lengths targeting the DMD gene. (A) Differences in the predicted number of off-target sites with 1, 2, 3, 4, or 5 nucleotide mismatches. A t-test was performed on the total predicted off-target sites of the 19, 18, and 17 nt sgRNAs against those of the 20 nt sgRNA; the P values are 1.5e-11, 3.17e-16, and 9.87e-17, respectively. (B) Differences in the predicted total number of off-target sites of the 20, 19, 18, and 17 nt sgRNAs. (C) Venn diagram of the predicted off-target sites of sgRNAs of different lengths. nt: nucleotides; sgR: single-guide RNA; M represents the number of nucleotide mismatches (1M, 2M, 3M, 4M, or 5M); 0M represents a perfect match to the on-target site; 0M sites are counted as off-target sites when the number of 0M sites is > 1.
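The 0M to 5M labels above are mismatch counts between the guide and a candidate site. A minimal Python sketch of that tally follows, assuming equal-length, ungapped comparison (a plain Hamming distance). The function names and the example sequences are hypothetical, and real off-target predictors also consider the PAM and may allow bulges.

```python
from collections import Counter


def count_mismatches(spacer, site):
    """Number of positions at which the spacer and a same-length site differ."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return sum(a != b for a, b in zip(spacer.upper(), site.upper()))


def tally_by_mismatch(spacer, sites, max_mismatches=5):
    """Group candidate sites into 0M, 1M, ... bins, ignoring sites above the cutoff."""
    counts = Counter()
    for site in sites:
        m = count_mismatches(spacer, site)
        if m <= max_mismatches:
            counts["%dM" % m] += 1
    return dict(counts)


# Made-up 17 nt guide and candidate sites, for illustration only.
guide = "ACGTACGTACGTACGTA"
candidates = [
    "ACGTACGTACGTACGTA",  # identical
    "ACGTACGTACGTACGTT",  # last base differs
    "TCGTACGTACGTACGTT",  # first and last bases differ
]
print(tally_by_mismatch(guide, candidates))
```

A shorter guide has fewer positions that can mismatch, which is one intuitive reason the predicted off-target count grows as the guide is truncated.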
Figure 3
Sensitivity of different methods for detecting the activity of a 20 nt sgRNA. (A) Activity of the sgRNA targeting the DMD gene, measured by T7ENI cleavage assay. (B) PCR products were sequenced in both directions. (C) Assessment of sgRNA activity by four methods: T7ENI cleavage assay; Sanger DNA sequencing followed by TIDE web-based software analysis; high-throughput amplicon sequencing (AmpliconSeq); and Target capture sequencing (TargetSeq). Bulk population represents unsorted cells; Sorted population represents sorted cells; Mock control represents Lipofectamine 2000 only; WT: wild-type cells; TIDE(F): Sanger DNA forward-sequencing data; TIDE(R): Sanger DNA reverse-sequencing data.
Figure 4
Detection of on- and off-target cleavage activities of sgRNAs of different lengths targeting the DMD gene, based on in silico prediction and Target capture sequencing. (A), (B), (C), and (D) show the on- and off-target cleavage efficiencies of the 20, 19, 18, and 17 nt sgRNAs, respectively, measured by Target capture sequencing in the sorted cell population. T (X:31227642:+) represents the on-target site, and OT represents an off-target site of the sgRNA. The x- and y-axes represent the Indel efficiencies of on- and off-targets for sgRNAs in the control and gene-editing groups, respectively. In the control group, a read count ≥ 10 is the threshold for a captured target, and the Indel efficiency is ≤ 1%. (E) Validation of off-target sites for the 20, 19, 18, and 17 nt sgRNAs using T7ENI cleavage assay. The Indel efficiencies shown below the agarose gel image are for the same predicted off-target site, as detected by the T7ENI cleavage assay and TargetSeq. Seed region represents the seed sequence, which is positions 1-12 of the spacer immediately 5′ of the PAM sequence. Control represents the negative control group; DL2000: DNA ladder. Nucleotides marked in red and blue represent the protospacer adjacent motif (PAM) and mismatches, respectively; OT: predicted off-target site.
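The control-group thresholds in the caption can be read as a simple per-site filter. The Python sketch below is one possible reading, not the authors' code: it assumes Indel efficiency is the percentage of reads at a site that carry an indel, and that a site is kept when its control sample has at least 10 reads and a control Indel efficiency of at most 1%. The function names and example counts are hypothetical.

```python
def indel_efficiency(indel_reads, total_reads):
    """Percentage of reads at a site that carry an indel (0.0 if no reads)."""
    if total_reads == 0:
        return 0.0
    return 100.0 * indel_reads / total_reads


def passes_control_filter(control_indel_reads, control_total_reads,
                          min_reads=10, max_indel_pct=1.0):
    """Assumed reading of the caption: enough control reads and low control background."""
    enough_reads = control_total_reads >= min_reads
    low_background = indel_efficiency(control_indel_reads, control_total_reads) <= max_indel_pct
    return enough_reads and low_background


# Made-up control counts per site: (indel reads, total reads).
control_counts = {"T": (1, 200), "OT1": (0, 8), "OT2": (5, 100)}
kept = [site for site, (indels, total) in control_counts.items()
        if passes_control_filter(indels, total)]
print(kept)
```

Filtering on the control sample first keeps sites with high background Indel signal from being mistaken for off-target cleavage in the edited sample.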
Figure 5
Indel frequency and read distribution of CRISPR/Cas9 off-target sites containing microsatellites in the control group. (A) Distribution of captured predicted off-target sites with different sequence features in the control group. (B) Distribution of editing efficiency and reads in predicted off-target sites with different sequence features in the control group. (C) Distribution of microsatellites in the predicted off-target sites of the 20, 19, 18, and 17 nt sgRNAs. NRG: protospacer adjacent motif (PAM); N = A, T, C, or G; R = A or G; STR: short tandem repeat; OT: predicted off-target site.
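Checking a predicted site for a microsatellite (STR) amounts to looking for a short motif repeated in tandem. The Python sketch below shows one simple way to do this with regular expressions. The motif size limit (1 to 6 bp) and the minimum of 3 copies are assumptions chosen for illustration, not the criteria used in the study, and the function names are hypothetical.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter motif (e.g. 'AT', not 'ATAT')."""
    return unit not in (unit + unit)[1:-1]


def find_microsatellites(seq, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for tandem repeats of short motifs in seq.

    For each motif length, a lookahead finds the longest run of whole copies
    starting at each position; starts inside an already reported run are skipped.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        pattern = re.compile(
            r"(?=(([ACGT]{%d})\2{%d,}))" % (unit_len, min_repeats - 1)
        )
        covered_until = 0
        for m in pattern.finditer(seq):
            unit = m.group(2)
            if m.start() < covered_until or not is_primitive(unit):
                continue
            run = m.group(1)
            hits.append((m.start(), unit, len(run) // unit_len))
            covered_until = m.start() + len(run)
    return hits


# Made-up sequence containing a (CA)5 dinucleotide repeat.
print(find_microsatellites("GGCACACACACATT"))
```

A site flagged this way could then be reviewed separately, since repeat-length variation at an STR can look like an indel even in unedited control cells.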
Figure 6
Validation of selected predicted off-target sites containing microsatellites. (A) Sequence and microsatellite features of potential off-target sites. (B) Detection of cleavage activity at the predicted off-target sites by T7ENI cleavage assay. Control represents the negative control group. POT: potential off-target site; ID: identity number; NRG: protospacer adjacent motif (PAM); N = A, T, C, or G; R = A or G. Nucleotides marked in red and blue represent the PAM and mismatches, respectively.
