CRMSS: predicting circRNA-RBP binding sites based on multi-scale characterizing sequence and structure features

Brief Bioinform. 2023 Jan 19;24(1):bbac530. doi: 10.1093/bib/bbac530.

Abstract

Circular RNAs (circRNAs) are reverse-spliced, covalently closed RNAs. Their interactions with RNA-binding proteins (RBPs) affect the progression of many diseases. Several computational methods have been proposed to identify RBP binding sites on circRNAs, but they suffer from insufficient accuracy, robustness and interpretability. In this study, we first take the characteristics of both the RNA and the RBP into consideration. We propose CRMSS, a method for discriminating circRNA-RBP binding sites based on multi-scale characterization of sequence and structure features. For circRNAs, we use sequence k-mer embeddings and the formation probabilities of local secondary structures as features. For RBPs, we combine sequence and structure frequencies of RNA-binding domain regions to generate features. We capture binding patterns with multi-scale residual blocks. With a BiLSTM and an attention mechanism, we obtain contextual information from the high-level representation of circRNA-RBP binding. To validate the effectiveness of CRMSS, we compare its predictive performance with that of other methods on 37 RBPs. By taking the properties of both circRNAs and RBPs into account, CRMSS achieves superior performance over state-of-the-art methods. In the case study, our model provides reliable predictions and correctly identifies experimentally verified circRNA-RBP pairs. The code of CRMSS is freely available at https://github.com/BioinformaticsCSU/CRMSS.
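
As an illustration of the k-mer step mentioned in the abstract, the following minimal sketch splits an RNA fragment into overlapping k-mers and maps each one to an integer index that an embedding layer could consume. It is not taken from the CRMSS repository: the function names, the choice of k = 3, the stride of 1 and the convention of reserving index 0 for k-mers containing unknown bases are all assumptions made for this example.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is converted to "U" below


def build_kmer_vocab(k):
    """Map every k-mer over ACGU to an index starting at 1 (0 = unknown)."""
    return {"".join(p): i + 1 for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_tokens(seq, k=3, stride=1):
    """Split a sequence into overlapping k-mers (hypothetical helper)."""
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def encode(seq, k=3, stride=1):
    """Turn a sequence into a list of integer k-mer indices."""
    vocab = build_kmer_vocab(k)
    # k-mers with ambiguous bases (e.g. "N") are not in the vocabulary
    return [vocab.get(tok, 0) for tok in kmer_tokens(seq, k, stride)]


if __name__ == "__main__":
    print(kmer_tokens("ACGUA"))
    print(encode("ACGUA"))
```

A sequence of length L yields L - k + 1 tokens at stride 1, so the resulting index list could be fed to a learned embedding table with 4^k + 1 rows. How CRMSS actually builds and trains its embeddings is described in the paper and its repository, not here.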

Keywords: Deep learning; RNA-binding protein; circular RNA.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • RNA* / metabolism
  • RNA, Circular* / genetics
  • RNA-Binding Proteins / metabolism

Substances

  • RNA, Circular
  • RNA
  • RNA-Binding Proteins