RIsearch: fast RNA-RNA interaction search using a simplified nearest-neighbor energy model

Anne Wenzel; Erdinç Akbasli; Jan Gorodkin

doi:10.1093/bioinformatics/bts519

Bioinformatics. 2012 Nov 1;28(21):2738-46. doi: 10.1093/bioinformatics/bts519. Epub 2012 Aug 24.

Authors

Anne Wenzel¹, Erdinç Akbasli, Jan Gorodkin

Affiliation

¹ Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg, Denmark.

Abstract

Motivation: Regulatory, non-coding RNAs often function by forming a duplex with other RNAs. It is therefore of interest to predict putative RNA-RNA duplexes in silico on a genome-wide scale. Current computational methods for predicting these interactions range from fast complementarity-based searches to those that take intramolecular binding into account. Together these methods constitute a trade-off between speed and accuracy, while leaving room for improvement within the context of genome-wide screens. A fast pre-filtering of putative duplexes would therefore be desirable.

Results: We present RIsearch, an implementation of a simplified Turner energy model for fast computation of hybridization, which significantly reduces runtime while maintaining accuracy. Its time complexity for sequences of lengths m and n is O(m·n), with a much smaller pre-factor than other tools. We show that this energy model is an accurate approximation of the full energy model for near-complementary RNA-RNA duplexes. RIsearch uses a Smith-Waterman-like algorithm with a dinucleotide scoring matrix that approximates the Turner nearest-neighbor energies. We show in benchmarks that we achieve a speed improvement of at least 2.4× compared with RNAplex, currently the fastest method for searching near-complementary regions. RIsearch shows a prediction accuracy similar to RNAplex on two datasets of known bacterial short RNA (sRNA)-messenger RNA (mRNA) and eukaryotic microRNA (miRNA)-mRNA interactions. Using RIsearch as a pre-filter in genome-wide screens reduces the number of binding site candidates reported by miRNA target prediction programs, such as TargetScanS and miRanda, by up to 70%. Likewise, substantial filtering was performed on bacterial RNA-RNA interaction data.
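
Illustration: The idea of a Smith-Waterman-like search over dinucleotide (stacking) scores can be sketched as follows. This is a minimal sketch under stated assumptions and not the RIsearch implementation: the values in PAIR_STRENGTH are hypothetical placeholders rather than Turner parameters, the names stack_score and best_duplex are invented for illustration, and bulges and internal loops are omitted, so only ungapped duplexes are found. It fills an m × n table once, which matches the quadratic running time.

```python
# Toy base-pair strengths. Hypothetical placeholders, NOT Turner energies.
PAIR_STRENGTH = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}


def stack_score(prev_pair, cur_pair):
    """Toy dinucleotide score for stacking cur_pair on top of prev_pair."""
    return PAIR_STRENGTH[prev_pair] + PAIR_STRENGTH[cur_pair]


def best_duplex(query, target):
    """Find the best ungapped local duplex between two RNA strings.

    Returns (score, query_segment, target_segment); both segments are
    given 5'->3'. Higher scores mean more stacked pairs in this toy model.
    """
    t_rev = target[::-1]  # read the target 3'->5' so pairs align on diagonals
    m, n = len(query), len(t_rev)
    # score[i][j]: best score of a duplex whose last pair is
    # (query[i-1], t_rev[j-1]); length[i][j]: number of pairs in it.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    length = [[0] * (n + 1) for _ in range(m + 1)]
    best = (0, 0, 0, 0)  # (score, i, j, length)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = (query[i - 1], t_rev[j - 1])
            if pair not in PAIR_STRENGTH:
                continue  # cell stays 0: the duplex is broken here
            if length[i - 1][j - 1] > 0:
                # Extend the duplex on the diagonal by one stacked pair.
                prev = (query[i - 2], t_rev[j - 2])
                score[i][j] = score[i - 1][j - 1] + stack_score(prev, pair)
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                # A lone pair starts a duplex but adds no stacking score.
                length[i][j] = 1
            if score[i][j] > best[0]:
                best = (score[i][j], i, j, length[i][j])

    s, i, j, k = best
    # Map the reversed-target coordinates back to the original orientation.
    return s, query[i - k:i], target[n - j:n - j + k]
```

A call such as best_duplex("GGG", "AACCCAA") returns the score together with the paired segments of the query and the target. A fuller model would add separate states for bulges and internal loops and would use measured nearest-neighbor energies in place of the toy table.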

Availability: The source code for RIsearch is available at: http://rth.dk/resources/risearch.

Publication types

Research Support, Non-U.S. Gov't
Validation Study

MeSH terms

Algorithms*
Base Pairing
Base Sequence
Binding Sites
Cluster Analysis
Computer Simulation*
Genes, Duplicate
Information Storage and Retrieval / methods*
MicroRNAs / chemistry
MicroRNAs / genetics
MicroRNAs / metabolism
Models, Molecular*
Position-Specific Scoring Matrices
RNA / chemistry
RNA / genetics
RNA / metabolism*
RNA, Bacterial / chemistry
RNA, Messenger / chemistry
RNA, Messenger / genetics
RNA, Messenger / metabolism
RNA, Untranslated / genetics*
Sequence Alignment

Substances

MicroRNAs
RNA, Bacterial
RNA, Messenger
RNA, Untranslated
RNA