Improving Docking-Based Virtual Screening Ability by Integrating Multiple Energy Auxiliary Terms from Molecular Docking Scoring

J Chem Inf Model. 2020 Sep 28;60(9):4216-4230. doi: 10.1021/acs.jcim.9b00977. Epub 2020 May 11.

Abstract

Virtual screening (VS) based on molecular docking is an efficient method for retrieving novel hit compounds in drug discovery. However, the accuracy of current docking scoring functions (SFs) is usually insufficient. In this study, to improve the screening power of SFs, a novel approach named EAT-Score is proposed that directly uses the energy auxiliary terms (EAT) provided by molecular docking scoring as input to eXtreme Gradient Boosting (XGBoost). Here, EAT refers specifically to the output of Molecular Operating Environment (MOE) scoring, comprising the energy scores of five different classical SFs and the Protein-Ligand Interaction Fingerprint (PLIF) terms. The ability of EAT-Score to discriminate actives from decoys was strictly validated on the DUD-E diverse subset using several performance metrics. The results showed that EAT-Score performed much better than classical SFs in VS, improving AUC values by around 0.3. EAT-Score also achieved comparable or even better prediction performance than other state-of-the-art VS methods, such as some machine learning (ML)-based SFs and classical SFs implemented in docking programs, in terms of AUC, LogAUC, or BEDROC. Furthermore, Shapley additive explanations (SHAP) analysis showed that the EAT-Score model can capture important binding pattern information from protein-ligand complexes, which may be very helpful in interpreting the ligand binding mechanism for a given target and thereby in guiding drug design.
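
The abstract reports active/decoy discrimination in terms of AUC and BEDROC. As a rough illustration of what these two metrics measure, the following is a minimal sketch in plain Python, not the authors' evaluation code. It assumes that a higher score means "more likely active" (as with a classifier probability; raw docking energies would need to be negated first), that labels are 1 for actives and 0 for decoys, and that BEDROC uses the commonly quoted alpha = 20. The function names and the toy data are hypothetical, and LogAUC is not covered.

```python
import math


def roc_auc(scores, labels):
    """ROC AUC: probability that a random active outscores a random decoy.

    Ties between an active and a decoy count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def bedroc(scores, labels, alpha=20.0):
    """BEDROC: early-recognition metric scaled to the range [0, 1].

    Actives are weighted by exp(-alpha * rank / N), so actives ranked
    near the top of the list contribute far more than those further down.
    Tied scores keep their input order (assumption of this sketch).
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_total = len(ranked)
    n_actives = sum(1 for _, y in ranked if y == 1)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need at least one active and one decoy")

    # Sum of exponential weights over the 1-based ranks of the actives.
    sum_exp = sum(
        math.exp(-alpha * rank / n_total)
        for rank, (_, y) in enumerate(ranked, start=1)
        if y == 1
    )

    # Robust initial enhancement (RIE): observed sum over its random expectation.
    random_mean = (1.0 - math.exp(-alpha)) / (
        n_total * (math.exp(alpha / n_total) - 1.0)
    )
    rie = sum_exp / (n_actives * random_mean)

    # Rescale RIE between its worst-case and best-case values.
    ratio = n_actives / n_total
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


# Toy example (made-up scores, not data from the study): 3 actives, 5 decoys.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
toy_bedroc = bedroc(toy_scores, toy_labels)
```

AUC weights every position in the ranked list equally, whereas BEDROC rewards placing actives at the very top, which is why VS studies usually report both.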

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Ligands
  • Machine Learning*
  • Molecular Docking Simulation
  • Protein Binding
  • Proteins* / metabolism

Substances

  • Ligands
  • Proteins