Extending pKa prediction accuracy: high-throughput pKa measurements to understand pKa modulation of new chemical series

Eur J Med Chem. 2010 Sep;45(9):4270-9. doi: 10.1016/j.ejmech.2010.06.026. Epub 2010 Jun 23.

Abstract

We recently developed MoKa, a tool that predicts the pKa of organic compounds, using a training set of over 26,500 literature pKa values. However, accurately predicting pKa (error < 0.5 pH units) remains challenging for novel series, which can be a drawback when optimizing the activity and ADME properties of lead compounds. Addressing this issue requires a broader knowledge of pKa determinants; we therefore conducted high-throughput pKa measurements using Spectral Gradient Analysis (SGA) on novel series of compounds selected from vendor databases. Here we report our findings on the effect of specific chemical groups and steric constraints on the pKa of functionalities common in medicinal chemistry, such as amines, sulfonamides, and amides. Furthermore, we report the pKa of ionizable groups that were not well represented in the MoKa database of literature pKa values, such as hydrazide derivatives. These findings helped us to enhance MoKa, which is benchmarked here on a set of experimental pKa values from the Roche in-house library (N = 5581; RMSE = 1.09; R² = 0.82). The accuracy of the predictions improved greatly (RMSE = 0.49; R² = 0.96) after training the software, via the automated tool Kibitzer, with 6226 pKa values taken from a different, appropriately selected set of Roche compounds. This demonstrates the value of using high-throughput pKa measurements to expand the training set of pKa values used by MoKa.
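
For readers unfamiliar with the two benchmark statistics quoted in the abstract, the following is a minimal sketch of how RMSE and R² might be computed from paired experimental and predicted pKa values. It is not the authors' code: the function names and the sample values are hypothetical, and R² is assumed here to be the squared Pearson correlation, since the abstract does not state which definition was used.

```python
import math


def rmse(experimental, predicted):
    """Root-mean-square error between paired pKa values (in pH units)."""
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - e) ** 2 for e, p in zip(experimental, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(experimental, predicted):
    """Squared Pearson correlation (an ASSUMED definition of R2)."""
    if len(experimental) != len(predicted) or len(experimental) < 2:
        raise ValueError("need at least two paired values")
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p)
              for e, p in zip(experimental, predicted))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_e == 0 or var_p == 0:
        raise ValueError("R2 is undefined when either series is constant")
    return cov ** 2 / (var_e * var_p)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not data from the paper.
    experimental_pka = [4.2, 9.1, 7.4, 10.3]
    predicted_pka = [4.5, 8.7, 7.9, 10.1]
    print(f"RMSE = {rmse(experimental_pka, predicted_pka):.2f}")
    print(f"R2   = {r_squared(experimental_pka, predicted_pka):.2f}")
```

Under this reading, an RMSE below 0.5 corresponds to the sub-0.5 pH unit accuracy target mentioned in the abstract. If R² were instead defined as the coefficient of determination, the value would differ whenever predictions are systematically biased.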

MeSH terms

  • Amides / chemistry
  • Amines / chemistry
  • Benchmarking
  • Chemical Phenomena*
  • Hydrazines / chemistry
  • Organic Chemicals / chemistry*
  • Sulfonamides / chemistry

Substances

  • Amides
  • Amines
  • Hydrazines
  • Organic Chemicals
  • Sulfonamides
  • hydrazine