Benefit of Retraining pKa Models Studied Using Internally Measured Data

J Chem Inf Model. 2015 Jul 27;55(7):1449-59. doi: 10.1021/acs.jcim.5b00172. Epub 2015 Jun 29.

Abstract

The ionization state of drugs influences many pharmaceutical properties, such as solubility, permeability, and biological activity. It is therefore important to understand the structure-property relationship for the acid-base dissociation constant pKa during lead optimization in order to make better-informed design decisions. Computational approaches, such as those implemented in MoKa, can help with this; however, their prediction errors are often too large, especially for proprietary compounds. In this contribution, we examine how retraining reduces prediction error. Using a longitudinal study with data measured over 15 years in a drug discovery environment, we assess the impact of model training on prediction accuracy and examine model degradation over time. Using the MoKa software, we demonstrate that regular retraining is required to counter the model degradation, observed over six to nine months, that results from changes in chemical space.
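The longitudinal comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' protocol and does not use the MoKa software: the data are synthetic, the record fields (`month`, `base_pred`, `measured`) are hypothetical, and "retraining" is reduced to a toy step that learns the mean residual of a fixed base predictor. It shows only the general idea of comparing a frozen model with one retrained before each successive test window.

```python
from statistics import mean


def rmse(errors):
    """Root-mean-square of a list of residuals."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


def fit_offset(records):
    """Toy stand-in for retraining: learn the mean residual (measured - predicted)."""
    return mean(r["measured"] - r["base_pred"] for r in records)


def rolling_evaluation(records, train_until, window):
    """Compare a model frozen at `train_until` with one retrained before each window.

    Each test window covers `window` months; the retrained model sees every
    measurement made before the window starts, the frozen model does not.
    """
    records = sorted(records, key=lambda r: r["month"])
    frozen = fit_offset([r for r in records if r["month"] < train_until])
    last_month = records[-1]["month"]
    results = []
    start = train_until
    while start <= last_month:
        test = [r for r in records if start <= r["month"] < start + window]
        if test:
            retrained = fit_offset([r for r in records if r["month"] < start])
            results.append({
                "start": start,
                "frozen_rmse": rmse(
                    [r["measured"] - (r["base_pred"] + frozen) for r in test]),
                "retrained_rmse": rmse(
                    [r["measured"] - (r["base_pred"] + retrained) for r in test]),
            })
        start += window
    return results


# Synthetic data (an assumption for illustration): the base predictor always
# says 7.0, while measured values drift by 0.1 pKa units per month, mimicking
# a gradual shift in chemical space.
records = [{"month": m, "base_pred": 7.0, "measured": 7.0 + 0.1 * m}
           for m in range(24)]

results = rolling_evaluation(records, train_until=6, window=6)
for row in results:
    print(row["start"], round(row["frozen_rmse"], 3), round(row["retrained_rmse"], 3))
```

In this toy setting the frozen model's error grows with each window, while the retrained model's error grows more slowly because it tracks part of the drift. A real study would replace `fit_offset` with a full model refit and use measured compounds rather than a synthetic trend.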

MeSH terms

  • Chemical Phenomena*
  • Machine Learning*
  • Models, Theoretical*
  • Reproducibility of Results