Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2

Biochem Biophys Res Commun. 2017 Dec 9;494(1-2):305-310. doi: 10.1016/j.bbrc.2017.10.035. Epub 2017 Oct 7.


Here we report the development of a machine-learning model to predict binding affinity based on the crystallographic structures of protein-ligand complexes. We used an ensemble of crystallographic structures (resolution better than 1.5 Å) for which half-maximal inhibitory concentration (IC50) data are available. Polynomial scoring functions were built using the energy terms of the MolDock and PLANTS scoring functions as explanatory variables. When prediction performance was tested, the supervised machine learning models showed improved predictive power compared with the PLANTS and MolDock scoring functions. In addition, the machine-learning model was applied to predict binding affinity for CDK2, where it performed better than the AutoDock4, AutoDock Vina, MolDock, and PLANTS scores.
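The idea of a polynomial scoring function built from energy terms can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the three "energy terms" are hypothetical stand-ins for docking-score components, the second-degree polynomial form and ordinary least squares fit are assumptions, and the function names (`polynomial_features`, `fit_scoring_function`, `predict`) are invented for illustration.

```python
import numpy as np


def polynomial_features(X):
    """Expand energy terms into [1, x_i, x_i * x_j (i <= j)] columns."""
    n_samples, n_terms = X.shape
    cols = [np.ones(n_samples)]                    # intercept
    cols.extend(X[:, i] for i in range(n_terms))   # linear terms
    for i in range(n_terms):                       # squares and cross terms
        for j in range(i, n_terms):
            cols.append(X[:, i] * X[:, j])
    return np.column_stack(cols)


def fit_scoring_function(X, y):
    """Fit polynomial weights by ordinary least squares (an assumed choice)."""
    A = polynomial_features(X)
    weights, *_ = np.linalg.lstsq(A, y, rcond=None)
    return weights


def predict(X, weights):
    """Predict affinity (for example, a pIC50-like value) from energy terms."""
    return polynomial_features(X) @ weights


# Synthetic stand-in data: 60 hypothetical complexes, 3 hypothetical energy terms.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 6.0 - 0.8 * X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=60)

# Hold out part of the data to check predictive power on unseen complexes.
X_train, X_test = X[:40], X[40:]
y_train, y_test = y[:40], y[40:]

weights = fit_scoring_function(X_train, y_train)
predicted = predict(X_test, weights)
correlation = np.corrcoef(predicted, y_test)[0, 1]
print(f"Pearson correlation on held-out set: {correlation:.3f}")
```

With real data, the columns of `X` would hold the energy terms reported by a docking program for each complex, and `y` would hold the experimental affinity. Comparing the held-out correlation with that of the original docking score is one way to test whether the fitted function improves on it.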

Keywords: Bioinformatics; CDK2; Docking; Drug design; Kinase; Machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antineoplastic Agents / chemistry*
  • Cyclin-Dependent Kinase 2 / antagonists & inhibitors*
  • Cyclin-Dependent Kinase 2 / chemistry
  • Databases, Protein
  • Datasets as Topic
  • Drug Design
  • Humans
  • Inhibitory Concentration 50
  • Ligands
  • Molecular Docking Simulation
  • Protein Kinase Inhibitors / chemistry*
  • ROC Curve
  • Supervised Machine Learning*
  • Thermodynamics

Substances

  • Antineoplastic Agents
  • Ligands
  • Protein Kinase Inhibitors
  • CDK2 protein, human
  • Cyclin-Dependent Kinase 2