RNA3DCNN: Local and global quality assessments of RNA 3D structures using 3D deep convolutional neural networks

PLoS Comput Biol. 2018 Nov 27;14(11):e1006514. doi: 10.1371/journal.pcbi.1006514. eCollection 2018 Nov.

Abstract

Quality assessment is essential for the computational prediction and design of RNA tertiary structures. To date, several knowledge-based statistical potentials have been proposed and shown to be effective in identifying native and near-native RNA structures. All of these potentials are based on the inverse Boltzmann formula but differ in their choice of geometrical descriptor, reference state, and training dataset. Departing entirely from conventional statistical potentials, our work explored a 3D convolutional neural network (CNN) as a quality evaluator for RNA 3D structures; the network takes a 3D grid representation of the structure as input, with no manual feature extraction. Because RNA structures are evaluated nucleotide by nucleotide, our method can also provide local quality assessment. Two sets of training samples were built. The first included 1 million samples generated by high-temperature molecular dynamics (MD) simulations, and the second included 1 million samples generated by Monte Carlo (MC) structure prediction. Both the MD and MC procedures were performed for a non-redundant set of 414 RNAs. From two training datasets (one containing only MD training samples and the other containing both MD and MC training samples), we trained two neural networks, named RNA3DCNN_MD and RNA3DCNN_MDMC, respectively. The former is suitable for assessing near-native structures, while the latter is suitable for assessing structures covering a large structural space. We tested the performance of our method and compared it with four traditional scoring functions. On two of the three test datasets, our method performed similarly to the state-of-the-art traditional scoring function, and on the third test dataset, our method was far superior to the other scoring functions. Our method can be downloaded from https://github.com/lijunRNA/RNA3DCNN.
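
To make the "3D grid representation" and the per-nucleotide evaluation concrete, the following is a minimal sketch of how the atoms around each nucleotide could be binned into a cubic grid suitable as CNN input. It is an illustration under stated assumptions, not the published implementation: the function names `voxelize_environment` and `nucleotide_grids`, the box size, the grid resolution, the single atom-count channel, and the use of the nucleotide centroid as the box centre are all hypothetical choices, and no rotation into a local reference frame is applied.

```python
import numpy as np

def voxelize_environment(atom_xyz, center, box_size=32.0, n_bins=32):
    """Count atoms per voxel in a cube of side `box_size` centred on `center`.

    box_size (angstroms) and n_bins are illustrative assumptions,
    not the settings used by RNA3DCNN.
    """
    atom_xyz = np.asarray(atom_xyz, dtype=float)
    center = np.asarray(center, dtype=float)
    voxel = box_size / n_bins
    # Shift so the cube's lower corner is the origin, then convert to voxel indices.
    idx = np.floor((atom_xyz - center + box_size / 2.0) / voxel).astype(int)
    # Keep only atoms that fall inside the cube.
    inside = np.all((idx >= 0) & (idx < n_bins), axis=1)
    grid = np.zeros((n_bins,) * 3, dtype=np.float32)
    # np.add.at accumulates correctly when several atoms share a voxel.
    np.add.at(grid, tuple(idx[inside].T), 1.0)
    return grid

def nucleotide_grids(atom_xyz, residue_ids, **kwargs):
    """Build one grid per nucleotide, centred on that nucleotide's centroid.

    Every atom of the structure is considered, so each grid captures the
    nucleotide together with its spatial neighbours.
    """
    atom_xyz = np.asarray(atom_xyz, dtype=float)
    residue_ids = np.asarray(residue_ids)
    grids = {}
    for rid in np.unique(residue_ids):
        center = atom_xyz[residue_ids == rid].mean(axis=0)
        grids[rid.item()] = voxelize_environment(atom_xyz, center, **kwargs)
    return grids
```

Each grid describes one nucleotide in its surroundings, so a network scoring the grids one at a time yields a local assessment for every position. How those local scores are combined into a global score is not stated in the abstract and is left out of the sketch.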

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Datasets as Topic
  • Hot Temperature
  • Molecular Dynamics Simulation
  • Monte Carlo Method
  • Neural Networks, Computer*
  • Nucleic Acid Conformation*
  • RNA / chemistry*

Substances

  • RNA

Grants and funding

This work was funded by the National Natural Science Foundation of China (http://www.nsfc.gov.cn/) (Grant Nos. 11774158 to JZ, 31671026 to JL, 11774157 to JW, 11574132 to WL, and 11334004 to WW). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.