Front Cell Dev Biol. 2018 Mar 5;6:20.
doi: 10.3389/fcell.2018.00020. eCollection 2018.

Improved circRNA Identification by Combining Prediction Algorithms

Thomas B Hansen. Front Cell Dev Biol.

Abstract

Non-coding RNAs are a class of gene regulators with diverse functions. One large subgroup is the recently discovered class of circular RNAs (circRNAs). CircRNAs are conserved and expressed in a tissue- and developmental stage-specific manner, although for the vast majority the functional relevance remains unclear. To identify and quantify circRNA expression, several bioinformatic pipelines have been developed to catalog the circRNAs in any given total RNA sequencing dataset. We recently compared five algorithms for circRNA detection; here, this analysis is extended to 11 algorithms. The sensitivity and specificity of each algorithm are evaluated by comparing the number of circRNAs discovered and their respective sensitivity to RNase R digestion. Moreover, the ability to predict de novo circRNAs, i.e., circRNAs not derived from annotated splice sites, is determined, as is the effect of eliminating low-quality and adaptor-containing reads prior to circRNA prediction. Finally, and most importantly, all possible pair-wise combinations of algorithms are tested, and guidelines for algorithm complementarity are provided. In conclusion, the algorithms mostly agree on highly expressed circRNAs; in many cases, however, individual algorithms predict algorithm-specific false positives with high read counts, a problem that is resolved by using the shared output from two (or more) algorithms.
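The idea of using the shared output from two algorithms can be illustrated with a minimal Python sketch. It assumes each algorithm's output has already been parsed into a dictionary keyed by back-splice junction coordinates; the `Junction` key layout, the function name `shared_predictions`, and the mock data are hypothetical and are not taken from the paper or from any of the tools' actual output formats.

```python
from typing import Dict, Tuple

# Hypothetical key for a back-splice junction: (chromosome, start, end, strand).
Junction = Tuple[str, int, int, str]


def shared_predictions(
    calls_a: Dict[Junction, int],
    calls_b: Dict[Junction, int],
) -> Dict[Junction, Tuple[int, int]]:
    """Return junctions reported by both algorithms, with both read counts."""
    shared = calls_a.keys() & calls_b.keys()
    return {j: (calls_a[j], calls_b[j]) for j in sorted(shared)}


# Mock data for illustration only; coordinates and read counts are invented.
algo_a = {("chr1", 1000, 2000, "+"): 50, ("chr2", 500, 900, "-"): 120}
algo_b = {("chr1", 1000, 2000, "+"): 46, ("chr3", 10, 400, "+"): 7}

print(shared_predictions(algo_a, algo_b))
```

In this toy case the junction on chr2 has a high read count in one algorithm but is discarded because the other algorithm does not report it, which mirrors how intersecting outputs removes algorithm-specific calls. Exact coordinate matching is an assumption here; real tools may differ in coordinate conventions and need harmonizing first.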

Keywords: bioinformatics; circular RNA; combining algorithms; gene prediction; non-coding RNA.


Figures

Figure 1
Specificity and sensitivity. (A) Stacked barplot of all predicted circRNAs stratified by RNase R resistant (≥ 5-fold enrichment, green), unaffected (1–5-fold enrichment, gray), and RNase R sensitive (depleted in RNase R treated samples, red), as denoted. Percentage reflects the fraction of RNase R sensitive circRNAs, defined as false positives. (B) Cumulative fraction plot of read counts for circRNAs shared by all 11 algorithms (n = 259), color-coded as denoted in the associated boxplot, where reads per circRNA are shown. (C) Ranked plot of the top 100 expressed circRNAs predicted by each algorithm, color-coded as in A. Percentage reflects the fraction of RNase R sensitive circRNAs (false positives) within the plotted top 100. (D) Boxplot of circRNA expression predicted by each algorithm, stratified by RNase R sensitivity (as in A).
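The three categories in panel A can be expressed as a small classifier. The sketch below is a hedged illustration in Python: it assumes fold enrichment is the ratio of treated to untreated read counts, that the counts are already normalized between libraries, and that "depleted" means a ratio below 1. The function names and the handling of boundary values are assumptions, not details given in the caption.

```python
from typing import Dict, Tuple


def classify_rnase_r(untreated: float, treated: float) -> str:
    """Classify one circRNA by RNase R fold enrichment (treated / untreated)."""
    if untreated <= 0:
        raise ValueError("untreated count must be positive to compute a ratio")
    fold = treated / untreated
    if fold >= 5:
        return "resistant"   # >= 5-fold enrichment
    if fold >= 1:
        return "unaffected"  # 1-5-fold enrichment
    return "sensitive"       # depleted after treatment (assumed: fold < 1)


def false_positive_fraction(counts: Dict[str, Tuple[float, float]]) -> float:
    """Fraction of circRNAs classified as RNase R sensitive."""
    if not counts:
        return 0.0
    labels = [classify_rnase_r(u, t) for u, t in counts.values()]
    return labels.count("sensitive") / len(labels)


# Invented (untreated, treated) counts for three hypothetical circRNAs.
example = {"circA": (10, 60), "circB": (10, 20), "circC": (10, 5)}
print({name: classify_rnase_r(u, t) for name, (u, t) in example.items()})
print(false_positive_fraction(example))
```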
Figure 2
Raw vs. processed reads. (A) Stacked barplot as in Figure 1A comparing circRNA predictions from raw reads (“Raw,” as seen in Figure 1A) with predictions from pre-processed reads (“Processed”). (B) Boxplot comparing the read counts on circRNAs shared between the “Raw” and “Processed” predictions.
Figure 3
De novo prediction of circRNAs. (A) Stacked barplot comparing the annotated and un-annotated (de novo) default outputs from ACFS, CIRCexplorer2, and KNIFE. (B) Stacked barplot comparing the overall circRNA predictions from ACFS, CIRCexplorer2, and KNIFE, either guided by annotation (default setting, as in Figure 1A) or forced de novo using mock annotations, alongside the algorithms that are de novo by default (circRNA_finder, DCC, CIRI, CIRI2, and find_circ, as in Figure 1A). (C) Back-splice spanning read counts on ciRS-7 obtained from each algorithm as an example of de novo prediction. For KNIFE, the de novo resolution is 50 bp, and ciRS-7 was here defined as chrX:139865300-139866900.
Figure 4
Conjoining algorithms. (A) Comprehensive stacked barplot analysis of RNase R sensitivity in the shared predictions by any two algorithms. The “dimmed” bars denote the unpaired algorithm (as also seen in Figure 1A). (B) Loess regression on the fraction of circRNAs found by the other algorithm (color-coded as seen in the legend) as a function of the ranked expression of circRNAs identified and quantified by the algorithm denoted in the strip. (C) Heatmap of the Complementary score. The Complementary score is calculated as $(iTF \times iTN)^2$, where $iTF$ is the fraction of true positive circRNAs (RNase R resistant circRNAs) found in the algorithm denoted on the y-axis and shared with the algorithm on the x-axis (Supplementary Figure 6A), and $iTN$ is $1 - fN$, where $fN$ is the fraction of RNase R sensitive species conjointly identified in the other algorithm (see Supplementary Figure 6B). Complementary scores ≥ 0.2 are denoted specifically. (D) For each algorithm, the maximum Complementary score (from C) is depicted.
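The Complementary score in panel C can be worked through with a short Python sketch. This is one reading of the caption and not the author's code: it assumes iTF is computed over the RNase R resistant circRNAs of the y-axis algorithm, fN over its RNase R sensitive circRNAs, and that "shared" means the same circRNA identifier appears in the other algorithm's output. The function name and the set-based inputs are hypothetical.

```python
from typing import Set


def complementary_score(
    resistant_y: Set[str],
    sensitive_y: Set[str],
    calls_x: Set[str],
) -> float:
    """Compute (iTF * iTN)^2 for algorithm y paired with algorithm x."""
    # iTF: fraction of y's RNase R resistant circRNAs also found by x.
    itf = len(resistant_y & calls_x) / len(resistant_y) if resistant_y else 0.0
    # fN: fraction of y's RNase R sensitive circRNAs also found by x.
    fn = len(sensitive_y & calls_x) / len(sensitive_y) if sensitive_y else 0.0
    itn = 1.0 - fn
    return (itf * itn) ** 2


# Invented identifiers: x recovers 3 of 4 resistant and 1 of 2 sensitive calls.
score = complementary_score(
    resistant_y={"a", "b", "c", "d"},
    sensitive_y={"e", "f"},
    calls_x={"a", "b", "c", "e"},
)
print(score)  # (0.75 * 0.5) ** 2
```

Under this reading, a partner algorithm scores highly only when it confirms most true positives while sharing few of the RNase R sensitive calls, and squaring the product penalizes pairs that do poorly on either term.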


