The scoring bias in reverse docking and the score normalization strategy to improve success rate of target fishing

Qiyao Luo; Liang Zhao; Jianxing Hu; Hongwei Jin; Zhenming Liu; Liangren Zhang

doi:10.1371/journal.pone.0171433

PLoS One. 2017 Feb 14;12(2):e0171433. doi: 10.1371/journal.pone.0171433. eCollection 2017.

Authors

Qiyao Luo¹, Liang Zhao¹, Jianxing Hu¹, Hongwei Jin¹, Zhenming Liu¹, Liangren Zhang¹

Affiliation

¹ State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing, P. R. China.

Abstract

Target fishing often relies on reverse docking to identify potential target proteins of a ligand from a protein database. Reverse docking is limited by the accuracy of current scoring functions in distinguishing true targets from non-target proteins. Many contemporary scoring functions were designed for virtual screening of small molecules and have not been specifically optimized for reverse docking, so they are easily influenced by the properties of protein pockets, which biases scores toward proteins with certain properties. This bias produces many false positives in reverse docking and interferes with the identification of true targets. In this paper, we conducted a large-scale reverse docking study (5000 molecules against 100 proteins) to examine the scoring bias in reverse docking with DOCK, Glide, and AutoDock Vina. We found frequent hits, namely interference proteins, in all three docking procedures. After analyzing the differences in pocket properties between these interference proteins and the others, we speculated that the interference proteins have a larger contact area with ligands (related to the size and shape of the protein pockets; for all three docking programs) or higher hydrophobicity (for Glide), which could cause the scoring bias. We then applied a score normalization method to eliminate this bias, which made docking scores more balanced across proteins in the reverse docking of the benchmark dataset. Finally, the Astex Diverse Set was used to validate the effect of score normalization on actual cases of reverse docking: the accuracy of target prediction increased significantly, by 21.5%, in reverse docking with Glide after score normalization, though there was no obvious change with DOCK or AutoDock Vina.
Our results demonstrate the effectiveness of score normalization in eliminating the scoring bias and improving the accuracy of target prediction in reverse docking. Moreover, the pocket properties identified here as causes of scoring bias toward certain proteins can provide a theoretical basis for further optimizing the scoring functions of docking programs in future research.
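
The abstract does not state which normalization formula was used, so the following Python sketch is only an illustration of the general idea under stated assumptions: each protein's docking scores are converted to z-scores over all docked ligands, and lower (more negative) scores are treated as better. The function names (top_hit_counts, normalize_per_protein) and the toy scores are hypothetical and are not taken from the paper.

```python
from collections import Counter
from statistics import mean, pstdev

# Toy docking scores, invented for illustration: scores[ligand][protein].
# Assumption: lower (more negative) means a better predicted binding.
scores = {
    "lig1": {"A": -7.0, "B": -5.0, "BIG": -9.0},
    "lig2": {"A": -5.0, "B": -8.0, "BIG": -9.5},
    "lig3": {"A": -5.5, "B": -5.0, "BIG": -10.0},
}


def top_hit_counts(score_table, top_n=1):
    """Count how often each protein is among the top_n best proteins per ligand."""
    counts = Counter()
    for per_protein in score_table.values():
        ranked = sorted(per_protein, key=per_protein.get)  # best score first
        counts.update(ranked[:top_n])
    return counts


def normalize_per_protein(score_table):
    """Replace each score by its z-score within that protein's score distribution.

    Assumption: a per-protein z-score stands in for the paper's normalization.
    """
    proteins = {p for per_protein in score_table.values() for p in per_protein}
    stats = {}
    for protein in proteins:
        values = [per[protein] for per in score_table.values() if protein in per]
        spread = pstdev(values)
        # Guard against a zero spread when all scores of a protein are equal.
        stats[protein] = (mean(values), spread if spread > 0 else 1.0)
    return {
        ligand: {
            protein: (score - stats[protein][0]) / stats[protein][1]
            for protein, score in per_protein.items()
        }
        for ligand, per_protein in score_table.items()
    }


raw_counts = top_hit_counts(scores)
normalized = normalize_per_protein(scores)
norm_counts = top_hit_counts(normalized)
```

In this toy table the protein "BIG" scores best for every ligand before normalization, which mimics a frequent hit. After the per-protein z-score, each ligand is ranked by how unusual its score is for that protein, so a protein that scores well for everything no longer dominates the ranking.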

MeSH terms

Algorithms*
Bias
Binding Sites
Computational Biology / methods*
Databases, Protein
Hydrophobic and Hydrophilic Interactions
Ligands
Molecular Docking Simulation / methods*
Protein Binding
Protein Conformation
Protein Interaction Mapping / methods*
Proteins / chemistry*
Proteins / metabolism
Reproducibility of Results
Software

Substances

Ligands
Proteins

Grants and funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 21272017 and 21572010; recipient: Zhenming Liu; http://www.nsfc.gov.cn/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.