Overview of the SAMPL6 pKa challenge: evaluating small molecule microscopic and macroscopic pKa predictions
- PMID: 33394238
- PMCID: PMC7904668
- DOI: 10.1007/s10822-020-00362-6
Abstract
The prediction of acid dissociation constants (pKa) is a prerequisite for predicting many other properties of a small molecule, such as its protein-ligand binding affinity, distribution coefficient (log D), membrane permeability, and solubility. The prediction of each of these properties requires knowledge of the relevant protonation states and solution free energy penalties of each state. The SAMPL6 pKa Challenge was the first time that a separate challenge was conducted for evaluating pKa predictions as part of the Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) exercises. This challenge was motivated by significant inaccuracies observed in prior physical property prediction challenges, such as the SAMPL5 log D Challenge, caused by protonation state and pKa prediction issues. The goal of the pKa challenge was to assess the performance of contemporary pKa prediction methods for drug-like molecules. The challenge set was composed of 24 small molecules that resembled fragments of kinase inhibitors, a number of which were multiprotic. Eleven research groups contributed blind predictions for a total of 37 distinct pKa prediction methods. In addition to blinded submissions, four widely used pKa prediction methods were included in the analysis as reference methods. Collecting both microscopic and macroscopic pKa predictions allowed in-depth evaluation of pKa prediction performance. This article highlights deficiencies of typical pKa prediction evaluation approaches when the distinction between microscopic and macroscopic pKas is ignored; in particular, we suggest more stringent evaluation criteria for microscopic and macroscopic pKa predictions guided by the available experimental data.
Top-performing submissions for macroscopic pKa predictions achieved RMSE of 0.7-1.0 pKa units and included both quantum chemical and empirical approaches, where the total number of extra or missing macroscopic pKas predicted by these submissions was fewer than 8 for 24 molecules. A large number of submissions had RMSE spanning 1-3 pKa units. Molecules with sulfur-containing heterocycles or iodo and bromo groups were less accurately predicted on average considering all methods evaluated. For a subset of molecules, we utilized experimentally determined microstates based on NMR to evaluate the dominant tautomer predictions for each macroscopic state. Prediction of dominant tautomers was a major source of error for microscopic pKa predictions, especially errors in charged tautomers. The degree of inaccuracy in pKa predictions observed in this challenge is detrimental to protein-ligand binding affinity predictions due to errors in dominant protonation state predictions and the calculation of free energy corrections for multiple protonation states. Underestimation of ligand pKa by 1 unit can lead to errors in binding free energy of up to 1.2 kcal/mol. The SAMPL6 pKa Challenge demonstrated the need for improving pKa prediction methods for drug-like molecules, especially for challenging moieties and multiprotic molecules.
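To illustrate how a pKa error can propagate into a free energy correction, the following is a minimal sketch under a simplified two-state model. It assumes a monoprotic base whose neutral form is the relevant species, and it is not necessarily the analysis used in the article. The function names, the example pKa and pH values, and the choice of 298.15 K are all illustrative assumptions. In this model the error grows with the gap between pKa and pH and approaches RT ln 10 per pKa unit, so it shows the order of magnitude only and does not attempt to reproduce the 1.2 kcal/mol figure quoted above.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 0.0019872041


def neutral_state_penalty(pka, ph, temperature=298.15):
    """Free energy cost (kcal/mol) of populating the neutral form of a
    monoprotic base at a given pH (assumed two-state model)."""
    # The neutral fraction is 1 / (1 + 10**(pKa - pH)); the penalty is
    # -RT ln(fraction).
    return R_KCAL * temperature * math.log(1.0 + 10.0 ** (pka - ph))


def penalty_error(true_pka, predicted_pka, ph, temperature=298.15):
    """Error (kcal/mol) in the penalty when a predicted pKa replaces the
    true pKa. Negative values mean the penalty is underestimated."""
    predicted = neutral_state_penalty(predicted_pka, ph, temperature)
    actual = neutral_state_penalty(true_pka, ph, temperature)
    return predicted - actual


if __name__ == "__main__":
    # Hypothetical base with a true pKa of 9.4, predicted 1 unit too low.
    error = penalty_error(true_pka=9.4, predicted_pka=8.4, ph=7.4)
    print(f"penalty error: {error:.2f} kcal/mol")
```

The sign convention here is predicted minus true, so an underestimated pKa for a base yields a negative error: the model makes the neutral state look cheaper to reach than it is.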
Keywords: Acid dissociation constant; Blind prediction challenge; Macroscopic pKa; Macroscopic protonation state; Microscopic pKa; Microscopic protonation state; SAMPL; Small molecule; pKa.
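The macroscopic pKa evaluation summarized in the abstract reports RMSE together with counts of extra or missing predicted pKas, which requires pairing each experimental value with a predicted one. The abstract does not state how that pairing is done, so the sketch below assumes one plausible scheme: choose the one-to-one assignment that minimizes the total squared error, found here by brute force because each molecule has only a few pKas. The function names and example values are hypothetical.

```python
import itertools
import math


def match_pkas(experimental, predicted):
    """Pair experimental and predicted pKas one-to-one so that the sum of
    squared errors is minimal. Returns (pairs, n_missing, n_extra), where
    pairs is a list of (experimental, predicted) tuples."""
    n_exp, n_pred = len(experimental), len(predicted)
    best_pairs, best_cost = [], math.inf
    if n_exp <= n_pred:
        # Try every ordered selection of predictions for the experiments.
        candidates = (list(zip(experimental, perm))
                      for perm in itertools.permutations(predicted, n_exp))
    else:
        # More experimental values than predictions: select experiments.
        candidates = (list(zip(perm, predicted))
                      for perm in itertools.permutations(experimental, n_pred))
    for pairs in candidates:
        cost = sum((e - p) ** 2 for e, p in pairs)
        if cost < best_cost:
            best_pairs, best_cost = pairs, cost
    n_missing = max(0, n_exp - n_pred)  # experimental pKas left unpredicted
    n_extra = max(0, n_pred - n_exp)    # predictions with no experimental match
    return best_pairs, n_missing, n_extra


def rmse(pairs):
    """Root-mean-square error over matched (experimental, predicted) pairs."""
    if not pairs:
        raise ValueError("no matched pairs to evaluate")
    return math.sqrt(sum((e - p) ** 2 for e, p in pairs) / len(pairs))


if __name__ == "__main__":
    # Hypothetical molecule: two measured pKas, three predicted ones.
    pairs, n_missing, n_extra = match_pkas([4.0, 9.5], [9.0, 3.5, 12.0])
    print(pairs, n_missing, n_extra, rmse(pairs))
```

Reporting the extra and missing counts alongside RMSE matters because a matching step silently discards unmatched predictions, so RMSE alone can hide a method that predicts the wrong number of ionization events.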
Similar articles
- pKa measurements for the SAMPL6 prediction challenge for a set of kinase inhibitor-like fragments. J Comput Aided Mol Des. 2018 Oct;32(10):1117-1138. doi: 10.1007/s10822-018-0168-0. Epub 2018 Nov 7. PMID: 30406372 Free PMC article.
- Assessing the accuracy of octanol-water partition coefficient predictions in the SAMPL6 Part II log P Challenge. J Comput Aided Mol Des. 2020 Apr;34(4):335-370. doi: 10.1007/s10822-020-00295-0. Epub 2020 Feb 27. PMID: 32107702 Free PMC article.
- Evaluation of log P, pKa, and log D predictions from the SAMPL7 blind challenge. J Comput Aided Mol Des. 2021 Jul;35(7):771-802. doi: 10.1007/s10822-021-00397-3. Epub 2021 Jun 24. PMID: 34169394 Free PMC article.
- Overview of the SAMPL5 host-guest challenge: Are we doing better? J Comput Aided Mol Des. 2017 Jan;31(1):1-19. doi: 10.1007/s10822-016-9974-4. Epub 2016 Sep 22. PMID: 27658802 Free PMC article. Review.
- The SAMPL4 host-guest blind prediction challenge: an overview. J Comput Aided Mol Des. 2014 Apr;28(4):305-17. doi: 10.1007/s10822-014-9735-1. Epub 2014 Mar 6. PMID: 24599514 Free PMC article. Review.
Cited by
- The maximal and current accuracy of rigorous protein-ligand binding free energy calculations. Commun Chem. 2023 Oct 14;6(1):222. doi: 10.1038/s42004-023-01019-9. PMID: 37838760 Free PMC article.
- MF-SuP-pKa: Multi-fidelity modeling with subgraph pooling mechanism for pKa prediction. Acta Pharm Sin B. 2023 Jun;13(6):2572-2584. doi: 10.1016/j.apsb.2022.11.010. Epub 2022 Nov 11. PMID: 37425064 Free PMC article.
- pK50─A Rigorous Indicator of Individual Functional Group Acidity/Basicity in Multiprotic Compounds. J Chem Inf Model. 2023 May 22;63(10):3198-3208. doi: 10.1021/acs.jcim.3c00187. Epub 2023 Apr 27. PMID: 37104727 Free PMC article.
- Benchmarking quantum chemical methods for accurate gas-phase structure predictions of carbonyl compounds: the case of ethyl butyrate. Phys Chem Chem Phys. 2023 Mar 15;25(11):7688-7696. doi: 10.1039/d2cp05774c. PMID: 36857713 Free PMC article.
- Improving Small Molecule pKa Prediction Using Transfer Learning With Graph Neural Networks. Front Chem. 2022 May 26;10:866585. doi: 10.3389/fchem.2022.866585. eCollection 2022. PMID: 35721000 Free PMC article.
