CircMiner: accurate and rapid detection of circular RNA through splice-aware pseudo-alignment scheme

Bioinformatics. 2020 Jun 1;36(12):3703-3711. doi: 10.1093/bioinformatics/btaa232.

Abstract

Motivation: The ubiquitous abundance of circular RNAs (circRNAs) has been revealed by performing high-throughput sequencing in a variety of eukaryotes. circRNAs are related to some diseases, such as cancer in which they act as oncogenes or tumor-suppressors and, therefore, have the potential to be used as biomarkers or therapeutic targets. Accurate and rapid detection of circRNAs from short reads remains computationally challenging. This is due to the fact that identifying chimeric reads, which is essential for finding back-splice junctions, is a complex process. The sensitivity of discovery methods, to a high degree, relies on the underlying mapper that is used for finding chimeric reads. Furthermore, all the available circRNA discovery pipelines are resource intensive.

Results: We introduce CircMiner, a novel stand-alone circRNA detection method that rapidly identifies and filters out linear RNA sequencing reads and detects back-splice junctions. CircMiner employs a rapid pseudo-alignment technique to identify linear reads that originate from transcripts, genes or the genome. CircMiner further processes the remaining reads to identify the back-splice junctions and detect circRNAs with single-nucleotide resolution. We evaluated the efficacy of CircMiner using simulated datasets generated from known back-splice junctions and showed that CircMiner has superior accuracy and speed compared to the existing circRNA detection tools. Additionally, on two RNase R treated cell line datasets, CircMiner was able to detect most of consistent, high confidence circRNAs compared to untreated samples of the same cell line.

Availability and implementation: CircMiner is implemented in C++ and is available online at https://github.com/vpc-ccg/circminer.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • RNA Splicing
  • RNA* / genetics
  • RNA, Circular*
  • Sequence Analysis, RNA

Substances

  • RNA, Circular
  • RNA