Unbiasing scoring functions: a new normalization and rescoring strategy

J Chem Inf Model. 2007 Jul-Aug;47(4):1564-71. doi: 10.1021/ci600471m. Epub 2007 Jun 7.

Abstract

Ligand bias can contribute significantly to the number of false positives observed in a virtual screening campaign. Using a receptor-based docking approach against a well-established target of therapeutic importance, estrogen receptor alpha (ERalpha), coupled with several common scoring functions (ChemGauss, ChemGauss2, ChemScore, ScreenScore, ShapeGauss, and PLP), taken both individually and as a consensus, we sought to examine the characteristics of the molecules retrieved by each. It has previously been shown that scoring functions (mainly empirical ones) are biased toward more complicated molecules, a bias that arises from the additive components within the function. Using Spearman's correlation coefficient, we show that a large set of descriptors calculated for a docked set of molecules correlates positively with ranked position in a hitlist. Moreover, most of these descriptors correlate well with molecular weight (MW). To this end, rather than penalizing the docked scores of all high-MW molecules and rewarding those of lower MW, as is common practice, we examine the impact of penalizing only the scores of higher-MW molecules, leaving lower-MW molecules unaffected. Here, we introduce a new power function to aid the process. Using scoring frequency analysis and SIFt fingerprints, we achieved a more meaningful analysis of virtual screening (VS) performance than with enrichment calculations, facilitating target-specific VS method development.
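The two ideas in the abstract, rank correlation between a descriptor and hitlist position and a penalty applied only above an MW cutoff, can be illustrated with a minimal Python sketch. The abstract does not give the form of the power function, so `one_sided_mw_penalty`, its `mw_threshold` of 400 and its `exponent` of 0.5 are purely hypothetical stand-ins, and the sketch assumes positive scores where higher is better. `spearman_rho` is the standard Spearman coefficient computed as Pearson correlation on average ranks.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def one_sided_mw_penalty(
    score: float,
    mw: float,
    mw_threshold: float = 400.0,  # hypothetical cutoff, not from the paper
    exponent: float = 0.5,        # hypothetical exponent, not from the paper
) -> float:
    """Hypothetical one-sided penalty: scale down the score only above the cutoff.

    Assumes positive scores where higher is better. Molecules at or below
    mw_threshold keep their original score.
    """
    if mw <= mw_threshold:
        return score
    return score / (mw / mw_threshold) ** exponent
```

With these assumed parameters, a molecule of MW 300 keeps its score unchanged, while a molecule of MW 1600 has its score divided by (1600/400)^0.5 = 2. A real application would need to fit the cutoff and exponent to the scoring function and target in question.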

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias*
  • Chemistry, Pharmaceutical*
  • Hydrogen Bonding