Translating from Proteins to Ribonucleic Acids for Ligand-binding Site Detection

Mol Inform. 2022 Oct;41(10):e2200059. doi: 10.1002/minf.202200059. Epub 2022 Jun 1.

Abstract

Identifying druggable ligand-binding sites on the surface of macromolecular targets is an important step in structure-based drug discovery. Deep-learning models have been shown to predict ligand-binding sites of proteins successfully. As a step toward predicting binding sites in RNA and RNA-protein complexes, we employ three-dimensional convolutional neural networks. We introduce a dataset-splitting approach to minimize structure-related bias in the training data, and investigate the influence of pre-training the neural network on proteins before fine-tuning on RNA structures. Models that were pre-trained on proteins considerably outperformed models that were trained exclusively on RNA structures. Overall, 71 % of the known RNA binding sites were correctly located within 4 Å of their true centres.
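The distance-based success criterion reported above can be illustrated with a minimal sketch. The abstract does not state how predicted and true sites are matched, so the rule used here (a true site counts as located if any predicted centre lies within the cutoff, boundary included) is an assumption, as are the function name `fraction_within_cutoff` and the toy coordinates.

```python
import math


def fraction_within_cutoff(predicted_centres, true_centres, cutoff=4.0):
    """Return the fraction of true site centres that have at least one
    predicted centre within `cutoff` angstroms (Euclidean distance).

    Assumption: a single nearby prediction is enough to count a site as
    located; the matching rule in the original work may differ.
    """
    if not true_centres:
        raise ValueError("need at least one true binding-site centre")
    hits = 0
    for true_c in true_centres:
        # math.dist computes the Euclidean distance between two points
        if any(math.dist(true_c, p) <= cutoff for p in predicted_centres):
            hits += 1
    return hits / len(true_centres)


# Toy coordinates in angstroms (invented for illustration only)
true_centres = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0)]
predicted_centres = [(1.0, 2.0, 2.0), (10.0, 5.0, 0.0)]

rate = fraction_within_cutoff(predicted_centres, true_centres)
print(f"{rate:.2%} of true sites located within 4 Å")
```

In this toy case only the first true centre has a prediction within 4 Å (3 Å away), while the nearest prediction to the second is 5 Å away, so one of three sites counts as located.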

Keywords: RNA; deep learning; drug discovery; drug target; neural network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Ligands
  • Neural Networks, Computer*
  • Proteins* / chemistry
  • RNA / metabolism

Substances

  • Ligands
  • Proteins
  • RNA