Genome Res. 26 (9), 1277-87

Diverse Alternative Back-Splicing and Alternative Splicing Landscape of Circular RNAs


Xiao-Ou Zhang et al. Genome Res.

Abstract

Circular RNAs (circRNAs) derived from back-spliced exons have been widely identified as being co-expressed with their linear counterparts. A single gene locus can produce multiple circRNAs through alternative back-splice site selection and/or alternative splice site selection; however, a detailed map of alternative back-splicing/splicing in circRNAs is lacking. Here, with the upgraded CIRCexplorer2 pipeline, we systematically annotated different types of alternative back-splicing and alternative splicing events in circRNAs from various cell lines. Compared with their linear cognate RNAs, circRNAs exhibited distinct patterns of alternative back-splicing and alternative splicing. Alternative back-splice site selection was correlated with the competition of putative RNA pairs across introns that bracket alternative back-splice sites. In addition, all four basic types of alternative splicing that have been identified in the (linear) mRNA process were found within circRNAs, and many exons were predominantly spliced in circRNAs. Unexpectedly, thousands of previously unannotated exons were detected in circRNAs from the examined cell lines. Although these novel exons had similar splice site strength, they were much less conserved than known exons in sequences. Finally, both alternative back-splicing and circRNA-predominant alternative splicing were highly diverse among the examined cell lines. All of the identified alternative back-splicing and alternative splicing in circRNAs are available in the CIRCpedia database (http://www.picb.ac.cn/rnomics/circpedia). Collectively, the annotation of alternative back-splicing and alternative splicing in circRNAs provides a valuable resource for depicting the complexity of circRNA biogenesis and for studying the potential functions of circRNAs in different cells.

Figures

Figure 1.
An upgraded computational pipeline (CIRCexplorer2) to systematically identify alternative (back-)splicing in back-spliced circular RNAs (circRNAs). (A) Schematic diagrams of two types of alternative back-splicing. Colored bars, exons. Black lines, introns. Red polylines, (canonical) collinear splicing. Red arc lines, back-splicing (circularization). (B) Schematic diagrams of four basic types of alternative splicing. Colored bars, exons. Black lines, introns. Red lines, splicing. Red arc lines, back-splicing (circularization). (C) The schematic diagram of CIRCexplorer2. The analysis was performed as described (Zhang et al. 2014) with modifications (Supplemental Methods). Alternative back-splicing and alternative splicing in circRNAs were determined with stringent criteria (Supplemental Methods). (D) Ten thousand circRNAs (gray bars) were detected by CIRCexplorer2. Thousands of novel exons (blue points) were identified in circRNAs in different human cell lines with de novo assembly (Supplemental Methods). (E) The identification and visualization of circRNAs in the CAMSAP1 locus from H9 (left panel) or PA1 (right panel) cell lines. Different types of RNA-seq data sets from ribo−, p(A)+, p(A)−, or p(A)−/RNase R RNA populations were used for comparison. CAMSAP1 circRNAs could be determined from ribo−, p(A)−, and p(A)−/RNase R RNA-seq data sets by identifying back-splice junctions. Notably, ribo− RNA-seq is not suitable to study canonical splicing events (intron retention, in this case) that occur specifically within circRNAs, as ribo− RNAs contain both polyadenylated and nonpolyadenylated transcripts. Blue bars, exons. Black lines, introns. Black thick line, the retained intron. Red arc lines, back-splicing (circularization).
Figure 2.
The diverse landscape of alternative back-splicing. (A) Approximately 12%–57% of back-splice sites are alternatively selected among high-confidence expressed circRNAs with RPM (mapped back-splice junction Reads Per Million mapped reads) ≥ 0.1. (B) A schematic diagram of alternative 5′ back-splicing and its quantification (top panel). The use of proximal and distal 5′ back-splice sites can be quantitated by the Percent Circularized-site Usage (PCU, bottom panel) with detected back-splice junction reads (i and j, respectively). (C) Diverse usage of alternative 5′ (left) and 3′ (right) back-splice sites among different cell lines. Each blue vertical line denotes PCU variation for one circRNA from the first quartile (Q1) to the third quartile (Q3) across cell lines, and each black vertical line denotes PCU variation from the minimum to the maximum. Note that only highly expressed circRNAs with RPM ≥ 0.1 in at least three cell lines were used for this analysis. (D) Visualization of alternative 5′ back-splice site usage in circRNAs produced from the RBM23 locus across different cell lines. Predicted circRNAs in the RBM23 locus are indicated by red arc lines with raw back-splice junction reads and normalized RPMs (numbers above each arc line, left panel). The use of the proximal alternative 5′ back-splice site (PCUp) or the distal alternative 5′ back-splice site (PCUd) was calculated accordingly (right panel).
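The two quantities used in this caption, RPM and PCU, can be illustrated with a minimal sketch. The caption names the inputs (back-splice junction reads i and j for the proximal and distal sites) but does not spell out the formula, so the code below assumes that PCU is each site's share of the combined junction reads, expressed as a percentage. The function names `rpm` and `pcu` are hypothetical and are not part of CIRCexplorer2.

```python
def rpm(junction_reads: int, total_mapped_reads: int) -> float:
    """Back-splice junction Reads Per Million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return junction_reads * 1_000_000 / total_mapped_reads


def pcu(proximal_reads: int, distal_reads: int) -> tuple[float, float]:
    """Percent Circularized-site Usage for a pair of alternative back-splice sites.

    Assumption (not stated in the caption): each site's PCU is its junction
    read count divided by the summed reads of both sites, times 100.
    Returns (PCU of proximal site, PCU of distal site).
    """
    total = proximal_reads + distal_reads
    if total == 0:
        raise ValueError("no back-splice junction reads for either site")
    return 100.0 * proximal_reads / total, 100.0 * distal_reads / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(rpm(5, 50_000_000))
    print(pcu(30, 10))
```

Under this assumption the two PCU values of a site pair always sum to 100, which is why a single value per circRNA suffices to describe usage across cell lines.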
Figure 3.
Competition of RNA pairs flanking proximal or distal back-splice sites leads to alternative back-splice site selection. (A,B) Potential RNA pairs (red dashed arc lines) produced by orientation-opposite complementary sequences (red arrows) flanking proximal (left top panels) or distal (left bottom panels) back-splice sites lead to alternative 5′ (A)/3′ (B) back-splice site selection, respectively (red arc lines). The competition of RNA pairs flanking proximal or distal back-splice sites leads to alternative back-splice site selection. More than 70% of the highly expressed circRNAs (RPM ≥ 0.1) with alternative back-splice site selection contain potential paired complementary sequences flanking both proximal and distal 5′/3′ back-splice sites (right panels). (C) Recapitulation of alternative back-splicing. (Left) A schematic drawing of egfp expression vectors with engineered complementary sequences for POLR2A circular RNA recapitulation. Half egfp sequences from the expression vector backbone are indicated as gray bars. POLR2A exonic and intronic sequences are indicated as colored bars and light purple lines, respectively. Nonrepetitive complementary sequences (red arrows) were inserted into multiple POLR2A intronic regions to form different RNA pairs (red dashed arc lines). Northern blot (NB) probes are indicated as colored bars. (Right) Validation of alternatively back-spliced POLR2A circRNAs by Northern blot on denaturing PAGE gel. Note that only partial complementary sequence (∼100 bp) was inserted into the middle intron for smaller POLR2A circRNA. (*) Linear RNA background.
Figure 4.
Unannotated exons produced from alternative back-splicing. (A) At least four novel exons (white bars) in the human MED13L locus were identified in PA1 p(A)− and/or p(A)−/RNase R RNA-seq data sets. The predicted circRNAs in the MED13L locus are indicated by red arc lines with raw back-splice junction reads from p(A)− (red) and/or p(A)−/RNase R (purple) RNA-seq data sets. Alternative back-splice sites were determined by both RNA-seq (top panel) and Sanger sequencing (bottom panel). Note that these new exons were barely detected in linear counterparts from parallel p(A)+ RNA-seq (the wiggle track in black). (B) Validation of multiple MED13L circRNAs with previously unannotated exons by Northern blot on native agarose gel. Note that the validation of these MED13L circRNAs with novel exons was consistent with RT-PCR (Supplemental Fig. S6B). (*) Linear RNA background. (C) Hundreds to thousands of circRNAs were identified with novel back-splice sites across different cell lines. (D) Splicing strength of novel back-splice sites is comparable to that of annotated back-splice sites. (**) P value < 0.01, Wilcoxon rank-sum test. (E) Novel (red) and annotated back-spliced exons (blue) have similar GC contents. (F) Novel back-spliced exons (red) are less conserved in sequences than are annotated back-spliced exons (blue).
Figure 5.
Characterization of circRNA-predominant alternative cassette exons. (A) A strategic pipeline to identify high-confidence circRNA-predominant alternative cassette exons. By comparing the alternative cassette exon selection between p(A)+ and p(A)− RNA-seq data sets, circRNA-predominant alternative cassette exons were selected using stringent criteria. (B) The high-confidence circRNA-predominant cassette exons were determined with expression in at least two cell lines. (C) The strength of the 5′/3′ splice sites of high-confidence circRNA-predominant cassette exons (blue) was comparable with those of cassette exons identified in linear RNAs (light gray) and constitutive exons (dark gray). (ns) Not significant, (**) P value < 0.01, Wilcoxon rank-sum test. (D) Similar densities of exonic splicing enhancers (ESE) were identified between high-confidence circRNA-predominant cassette exons (blue) and cassette exons identified in linear RNAs (light gray) and constitutive exons (dark gray). (ns) Not significant, (*) P value < 0.05, Wilcoxon rank-sum test. (E) The high-confidence circRNA-predominant cassette exons (blue) were slightly less conserved than were cassette exons identified in linear RNAs (light gray) and constitutive exons (dark gray).
Figure 6.
Unannotated circRNA-predominant alternative cassette exons. (A) Hundreds of previously unannotated circRNA-predominant cassette exons (red) were identified in circRNAs from individual cell lines. (B) Previously unannotated circRNA-predominant cassette exons (red) were much less conserved than the annotated circRNA-predominant cassette exons (blue) or cassette exons in linear RNAs (gray). (C) Validation of circRNA-predominant cassette exons in PA1 cells by RT-PCR. Similar to the RNA-seq results (PSI ratio), semiquantitative RT-PCR showed the detection of six circRNA-predominant cassette exons from the p(A)− and p(A)−/RNase R RNA populations but barely any from p(A)+ RNAs. The inclusion ratio of circRNA-predominant cassette exons from RT-PCR was determined by Quantity One (Bio-Rad). Note that circRNA-predominant cassette exons in PIP5K1C, PRRC2B, and RBPMS loci (bottom) were not previously annotated by RefGene, UCSC Known Genes, or Ensembl. Magenta bars, circRNA-predominant cassette exons. Blue bars, known exons. Divergent PCR primers are indicated as black arrows. (D) Validation of circRNA-predominant cassette exon inclusion from different cell lines by Northern blot on denaturing PAGE gels. The circRNAs with alternative cassette exon inclusion/exclusion in both PIP5K1C and PRRC2B loci were detected in PA1 and H9 cell lines. Note that the circRNA-predominant cassette exons in PIP5K1C or PRRC2B loci were previously unannotated. Magenta in the circles, circRNA-predominant cassette exons. Blue in the circles, known exons. (*) Linear RNA background. (E) Visualization of circRNA-predominant cassette exon inclusion in the PIP5K1C locus from different cell lines. The inclusion ratio of the circRNA-predominant cassette exons from RNA-seq is indicated by PSI. Note that the circRNA-predominant cassette exon in the PIP5K1C locus is a newly identified exon in this study. Magenta bar, circRNA-predominant cassette exon. Blue bars, other exons.
