Machine learning enabled identification of potential SARS-CoV-2 3CLpro inhibitors based on fixed molecular fingerprints and Graph-CNN neural representations

J Biomed Inform. 2021 Jul:119:103821. doi: 10.1016/j.jbi.2021.103821. Epub 2021 May 28.

Abstract

Aim: Rapidly developing AI and machine learning (ML) technologies can expedite therapeutic development, and their merits are in particular focus during the current pandemic. The purpose of this study was to explore various ML approaches for molecular property prediction and to illustrate their utility for identifying potential SARS-CoV-2 3CLpro inhibitors.

Materials and methods: We perform a series of drug discovery screenings based on supervised ML models that operate on molecular representations in different ways: shallow learning methods based on fixed molecular fingerprints, a Graph Convolutional Neural Network (Graph-CNN) with its self-learned molecular representations, and ML methods that combine fixed and Graph-CNN learned representations.
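
As a rough illustration of the third family of models, one common way to combine a fixed fingerprint with a learned graph embedding is to concatenate the two vectors before passing them to a downstream classifier. The abstract does not state how the combination was done in this study, so the sketch below is an assumption. The function name `combine_representations`, the 8-bit fingerprint, and the 3-dimensional embedding are hypothetical toy values, not data from the paper.

```python
from typing import List, Sequence


def combine_representations(
    fingerprint_bits: Sequence[int],
    learned_embedding: Sequence[float],
) -> List[float]:
    """Concatenate a fixed binary fingerprint with a learned embedding.

    Assumption: simple concatenation; the study may combine them differently.
    """
    if any(bit not in (0, 1) for bit in fingerprint_bits):
        raise ValueError("fingerprint must contain only 0/1 bits")
    # Cast everything to float so the result is one homogeneous feature vector.
    return [float(bit) for bit in fingerprint_bits] + [
        float(value) for value in learned_embedding
    ]


# Hypothetical toy inputs: an 8-bit fixed fingerprint and a 3-d learned embedding.
fingerprint = [1, 0, 0, 1, 0, 1, 1, 0]
embedding = [0.12, -0.40, 0.93]

combined = combine_representations(fingerprint, embedding)
print(len(combined), combined)
```

The combined vector can then be given to any shallow learner in place of the fingerprint alone.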

Results: The ML models are compared both on aggregated predictive performance, measured as ROC-AUC on scaffold splits, and at the granular level of individual predictions for the top-ranked repurposing candidates. This comparison reveals a characteristic homogeneity in chemical and pharmacological classification, with a prevalence of sulfonamides and anticancer drugs, and also identifies novel groups of potential drug candidates against COVID-19.
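
The evaluation protocol named above (ROC-AUC on scaffold splits) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the scaffold labels are assumed to be precomputed strings (in practice they would come from a cheminformatics toolkit), the greedy "largest groups go to train first" rule is one common heuristic, and the names `scaffold_split` and `roc_auc` and all data values are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def scaffold_split(
    scaffolds: Sequence[Hashable], test_fraction: float = 0.2
) -> Tuple[List[int], List[int]]:
    """Split indices so that no scaffold occurs in both train and test."""
    groups = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)
    # Largest scaffold groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))
    train_capacity = len(scaffolds) - int(round(len(scaffolds) * test_fraction))
    train, test = [], []
    for group in ordered:
        # A whole group goes to one side, never split across both.
        if len(train) + len(group) <= train_capacity:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: ten molecules with precomputed scaffold labels.
scaffolds = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "D"]
train_idx, test_idx = scaffold_split(scaffolds, test_fraction=0.2)
print("train:", train_idx, "test:", test_idx)

# Hypothetical labels and model scores for a small held-out set.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print("ROC-AUC:", auc)
```

Because whole scaffold groups are held out, the test set contains only chemotypes the model has not seen, which typically gives a more conservative estimate than a random split.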

Conclusions: A series of ML approaches for molecular property prediction enables drug discovery screenings, and we illustrate their utility for COVID-19. The results correspond well with already published research on COVID-19 treatment and provide novel insights into potential antiviral characteristics inferred from in vitro data.

Keywords: AI drug repurposing; COVID-19; Graph Convolutional Neural Network; Machine learning; Molecular property prediction; SARS-CoV-2.

MeSH terms

  • COVID-19 Drug Treatment*
  • Humans
  • Machine Learning
  • Neural Networks, Computer
  • SARS-CoV-2*