RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning

Nat Commun. 2019 Nov 27;10(1):5407. doi: 10.1038/s41467-019-13395-9.

Abstract

Most of the human genome is transcribed into noncoding RNAs with unknown structures and functions. Obtaining functional clues for noncoding RNAs requires accurate base-pairing or secondary-structure prediction. However, the performance of such predictions by current folding-based algorithms has stagnated for more than a decade. Here, we propose the use of deep contextual learning for base-pair prediction, including noncanonical and non-nested (pseudoknot) base pairs stabilized by tertiary interactions. Since only <250 nonredundant, high-resolution RNA structures are available for model training, we utilize transfer learning from a model initially trained with a recent high-quality bpRNA dataset of >10,000 nonredundant RNAs made available through comparative analysis. The resulting method achieves large, statistically significant improvement in predicting all base pairs, noncanonical and non-nested base pairs in particular. The proposed method (SPOT-RNA), with a freely available server and standalone software, should be useful for improving RNA structure modeling, sequence alignment, and functional annotations.
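
To make the base-pair categories named in the abstract concrete, the following is a minimal sketch of how a list of base pairs might be labeled as canonical or noncanonical and as nested or non-nested. It is not SPOT-RNA code. The function names are hypothetical, and two conventions are assumed: G-U wobble pairs are counted as canonical alongside A-U and G-C, and a pair is called non-nested if it crosses at least one other pair (stricter pseudoknot definitions exist).

```python
# Illustrative sketch only; not taken from SPOT-RNA. All names are hypothetical.

# Assumption: Watson-Crick pairs plus the G-U wobble pair count as canonical.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def is_canonical(seq, i, j):
    """Return True if positions i and j (0-based) form a canonical pair."""
    return (seq[i].upper(), seq[j].upper()) in CANONICAL


def crosses(p, q):
    """Return True if pairs p and q interleave, i.e. i < k < j < l or k < i < l < j."""
    (i, j), (k, l) = sorted(p), sorted(q)
    return i < k < j < l or k < i < l < j


def classify_pairs(seq, pairs):
    """Label each pair as canonical or not, and as non-nested (crossing) or not."""
    norm = [tuple(sorted(p)) for p in pairs]
    return {
        p: {
            "canonical": is_canonical(seq, *p),
            # Assumption: "non-nested" means the pair crosses some other pair.
            "non_nested": any(crosses(p, q) for q in norm if q != p),
        }
        for p in norm
    }


if __name__ == "__main__":
    # Toy sequence and pairs chosen by hand for illustration, not real data.
    labels = classify_pairs("GAGCACUC", [(0, 3), (2, 5), (4, 7)])
    for pair, info in labels.items():
        print(pair, info)
```

In this toy input, pair (4, 7) joins A and C, so it is labeled noncanonical, and pairs (0, 3) and (2, 5) interleave, so both are labeled non-nested.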

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Base Pairing
  • Computational Biology / methods
  • Databases, Genetic
  • Deep Learning
  • Humans
  • Neural Networks, Computer*
  • Protein Structure, Secondary
  • RNA / chemistry*
  • RNA, Untranslated / chemistry
  • Software*

Substances

  • RNA, Untranslated
  • RNA