Exploring the Potential of Spherical Harmonics and PCVM for Compounds Activity Prediction

Int J Mol Sci. 2019 May 2;20(9):2175. doi: 10.3390/ijms20092175.

Abstract

Biologically active chemical compounds may provide remedies for several diseases. Meanwhile, Machine Learning techniques applied to Drug Discovery, which are cheaper and faster than wet-lab experiments, can identify molecules with the expected pharmacological activity more effectively. Therefore, it is urgent and essential to develop more representative descriptors and more reliable classification methods to accurately predict molecular activity. In this paper, we investigate the potential of a novel representation based on Spherical Harmonics fed into a Probabilistic Classification Vector Machines classifier, namely SHPCVM, for the compound activity prediction task. We make use of representation learning to acquire features that describe the molecules as precisely as possible. To verify the performance of SHPCVM, ten-fold cross-validation tests are performed on twenty-one G protein-coupled receptors (GPCRs). Experimental results, assessed by classification accuracy, precision, recall, Matthews' Correlation Coefficient and Cohen's kappa, show that our relatively short Spherical Harmonics-based representation combined with Probabilistic Classification Vector Machines can achieve very satisfactory performance for GPCRs (accuracy of 0.86).
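
The evaluation protocol named in the abstract (ten-fold cross-validation scored by accuracy, precision, recall, Matthews' Correlation Coefficient and Cohen's kappa) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `kfold_indices` and `binary_metrics`, the shuffling seed, the 0/1 label encoding (1 = active, 0 = inactive), and the choice to return 0.0 when a denominator is zero are all assumptions made for illustration. The metric formulas themselves are the standard binary-classification definitions.

```python
import math
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: shuffles indices once, then deals them into k folds.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def _safe_div(num, den):
    # Assumption: report 0.0 instead of raising when a denominator is zero.
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Compute the five scores for binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = _safe_div(tp + tn, n)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)

    # Matthews' Correlation Coefficient
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_den)

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_observed = accuracy
    p_expected = _safe_div((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _safe_div(p_observed - p_expected, 1 - p_expected)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

In a full experiment, a classifier would be trained on each `train` split and scored on the matching `test` split, with the five scores averaged over the ten folds. The sketch leaves the classifier out because it only illustrates the scoring step.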

Keywords: G protein-coupled receptors; cheminformatics; machine learning; molecular activity predictions; molecular representation; representation learning.

MeSH terms

  • Algorithms
  • Animals
  • Databases, Protein
  • Drug Discovery / methods*
  • Humans
  • Machine Learning*
  • Receptors, G-Protein-Coupled / metabolism*
  • Support Vector Machine

Substances

  • Receptors, G-Protein-Coupled