Machine Learning to Predict Binding Affinity

Methods Mol Biol. 2019;2053:251-273. doi: 10.1007/978-1-4939-9752-7_16.


Recent progress in scientific libraries that implement machine-learning techniques has paved the way for integrated computational tools that predict ligand-binding affinity from the atomic coordinates of protein-ligand complexes. These tools make it possible to apply a broad spectrum of machine-learning techniques to the study of protein-ligand interactions. The essential step in these approaches is training a new computational model; supervised machine-learning techniques, convolutional neural networks, and random forests are among the most commonly applied methods. In this chapter, we focus on supervised machine-learning techniques and their application to the development of protein-targeted scoring functions for the prediction of binding affinity. We discuss the development of the program SAnDReS and its use in building machine-learning models to predict inhibition of cyclin-dependent kinase and HIV-1 protease. We also describe the scoring function space and how to use it to explain the development of targeted scoring functions.
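
As a rough illustration of the supervised regression idea summarized above, the sketch below fits a linear scoring function to synthetic data. It is not the SAnDReS implementation or its API: the three "energy terms", the weights, the noise level, and the function names (fit_scoring_function, predict_affinity, pearson_r) are all assumptions made for the example, and ordinary least squares stands in for whichever regression method a real study would select.

```python
import numpy as np


def fit_scoring_function(terms, affinity):
    """Fit affinity ~ w0 + sum_i(w_i * term_i) by ordinary least squares."""
    X = np.column_stack([np.ones(len(terms)), terms])  # prepend intercept column
    weights, *_ = np.linalg.lstsq(X, affinity, rcond=None)
    return weights


def predict_affinity(weights, terms):
    """Apply fitted weights to the energy terms of new complexes."""
    X = np.column_stack([np.ones(len(terms)), terms])
    return X @ weights


def pearson_r(x, y):
    """Pearson correlation between predicted and experimental values."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic stand-in for a dataset of protein-ligand complexes.
# Each row holds three hypothetical energy terms computed from coordinates.
rng = np.random.default_rng(0)
n_complexes = 60
terms = rng.normal(size=(n_complexes, 3))

# Hypothetical "true" relationship used only to generate example affinities.
true_weights = np.array([6.0, -0.8, -0.5, 0.3])
noise = rng.normal(scale=0.2, size=n_complexes)
affinity = true_weights[0] + terms @ true_weights[1:] + noise

# Train on the first 40 complexes, evaluate on the remaining 20.
train, test = slice(0, 40), slice(40, None)
weights = fit_scoring_function(terms[train], affinity[train])
predicted = predict_affinity(weights, terms[test])
r_test = pearson_r(predicted, affinity[test])

print("fitted weights:", np.round(weights, 3))
print("test-set Pearson r:", round(r_test, 3))
```

Training on complexes of a single protein family, as the chapter does for cyclin-dependent kinase and HIV-1 protease, is what would make such a function "targeted"; here the held-out correlation only shows the general train-then-evaluate workflow on made-up numbers.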

Keywords: Binding affinity; Cyclin-dependent kinase; HIV-1 protease; Machine learning; Regression; SAnDReS; Scoring function space.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cyclin-Dependent Kinase 2 / chemistry
  • Databases, Protein
  • HIV Protease / chemistry
  • Humans
  • Ligands
  • Machine Learning*
  • Models, Statistical
  • Molecular Docking Simulation*
  • Molecular Dynamics Simulation*
  • Proteins / chemistry*
  • Software*
  • Supervised Machine Learning
  • Web Browser

Substances

  • Ligands
  • Proteins
  • Cyclin-Dependent Kinase 2
  • HIV Protease
  • p16 protease, Human immunodeficiency virus 1