Modeling RNA Secondary Structure Folding Ensembles Using SHAPE Mapping Data

Nucleic Acids Res. 2018 Jan 9;46(1):314-323. doi: 10.1093/nar/gkx1057.

Abstract

RNA secondary structure prediction is widely used for developing hypotheses about the structures of RNA sequences, and structure can provide insight into RNA function. Experimental mapping data, which report on the pairing status of individual nucleotides, are known to improve the accuracy of structure prediction, and these data can now be acquired for whole transcriptomes using high-throughput sequencing. Prior methods for using these experimental data assumed that each sequence populates a single structure. Most RNAs populate multiple structures, however: across the ensemble of strands, different structures with different sets of canonical base pairs are present. The focus on modeling single structures has been a bottleneck for accurately modeling RNA structure. In this work, we introduce Rsample, an algorithm that uses experimental data to predict more than one RNA structure for sequences that populate multiple structures at equilibrium. We demonstrate, using SHAPE mapping data, that we can accurately model RNA sequences that populate multiple structures, including the relative probabilities of those structures. This program is freely available as part of the RNAstructure software package.
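
To make "relative probabilities of those structures" concrete, the following is a minimal sketch of how populations at equilibrium follow from folding free energies through the Boltzmann distribution. It is not the Rsample algorithm and does not use SHAPE data; the function name `boltzmann_probabilities`, the example free energies, and the choice of 37 °C (310.15 K) are assumptions made purely for illustration.

```python
import math

# Gas constant in kcal/(mol*K), the unit convention commonly used for
# RNA folding free energies.
GAS_CONSTANT_KCAL = 1.987204e-3


def boltzmann_probabilities(free_energies_kcal, temperature_k=310.15):
    """Return equilibrium probabilities for structures with the given
    folding free energies (kcal/mol), assuming these structures are the
    only ones populated. Hypothetical helper, not part of RNAstructure."""
    rt = GAS_CONSTANT_KCAL * temperature_k
    # Shift by the lowest free energy so the exponentials stay well scaled.
    lowest = min(free_energies_kcal)
    weights = [math.exp(-(g - lowest) / rt) for g in free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: two alternative structures 0.5 kcal/mol apart.
probs = boltzmann_probabilities([-10.0, -9.5])
print(probs)
```

In this sketch the lower free energy structure is always the more populated one, and a difference of only a fraction of a kcal/mol still leaves the alternative structure with a substantial share of the ensemble, which is why a single-structure model can miss it.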

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Base Pairing / genetics
  • Base Sequence
  • Computational Biology / methods*
  • Models, Molecular*
  • RNA / chemistry*
  • RNA / genetics
  • RNA Folding*
  • Reproducibility of Results

Substances

  • RNA