J Comput Aided Mol Des. 2021 Jul;35(7):771-802.
doi: 10.1007/s10822-021-00397-3. Epub 2021 Jun 24.

Evaluation of log P, pKa, and log D predictions from the SAMPL7 blind challenge

Free PMC article

Teresa Danielle Bergazin et al. J Comput Aided Mol Des. 2021 Jul.

Abstract

The Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) challenges focus the computational modeling community on areas in need of improvement for rational drug design. The SAMPL7 physical property challenge dealt with prediction of octanol-water partition coefficients and pKa for 22 compounds. The dataset was composed of a series of N-acylsulfonamides and related bioisosteres. 17 research groups participated in the log P challenge, submitting 33 blind submissions total. For the pKa challenge, 7 different groups participated, submitting 9 blind submissions in total. Overall, the accuracy of octanol-water log P predictions in the SAMPL7 challenge was lower than octanol-water log P predictions in SAMPL6, likely due to a more diverse dataset. Compared to the SAMPL6 pKa challenge, accuracy remains unchanged in SAMPL7. Interestingly, here, though macroscopic pKa values were often predicted with reasonable accuracy, there was dramatically more disagreement among participants as to which microscopic transitions produced these values (with methods often disagreeing even as to the sign of the free energy change associated with certain transitions), indicating far more work needs to be done on pKa prediction methods.

Keywords: Free energy calculations; SAMPL; log P; pKa.

Conflict of interest statement

David L. Mobley serves on the Scientific Advisory Board of OpenEye Scientific Software and is an Open Science Fellow with Silicon Therapeutics, a subsidiary of Roivant.

Figures

Fig. 1
Structures of the 22 molecules used for the SAMPL7 physical property blind prediction challenge. Log of the partition coefficient between n-octanol and water was determined via potentiometric titrations using a Sirius T3 instrument. pKa values were determined by potentiometric titrations using a Sirius T3 instrument. Log of the distribution coefficient between n-octanol and aqueous buffer at pH 7.4 was determined via potentiometric titrations using a Sirius T3 instrument, except for compounds SM27, SM28, SM30-SM34, and SM36-SM39, which had log D7.4 values determined via shake-flask assay. PAMPA assay data include effective permeability, membrane retention, and log of the apparent permeability coefficient. Permeabilities for compounds SM33, SM35, and SM39 were not determined. Compounds SM35, SM36, and SM37 are single cis configuration isomers. All other compounds are not chiral.
Fig. 2
For each molecule in the SAMPL7 pKa challenge we asked participants to predict the relative free energy between our selected neutral reference microstate and the rest of the enumerated microstates for that molecule. In this case, we asked for the relative state free energy including the proton free energy, which could also be called the reaction free energy for the microstate transition which has the reference state as the reactant and the alternate state as the product. Using SM43 as an example, participants were asked to predict the relative free energy between SM43_micro000 (our selected neutral microstate highlighted in yellow) and all of the other enumerated microstates (SM43_micro001–SM43_micro005) for a total of 5 relative state free energies (ΔG_BA, ΔG_CA, ΔG_DA, ΔG_EA, ΔG_FA). Some transitions involved a change in a single protonation state (e.g. the D–A transition of Figure 2) or tautomer (e.g. the C–A transition of Figure 2). A few cases involved a change of multiple protons (e.g. the F–A transition of Figure 2). All transitions were defined as away from the neutral reference state. Distinct microstates are defined as all tautomers of each charge state. For each relative free energy prediction reported, participants also submitted the formal charge after transitioning from the selected neutral state to the other state. For example, the reported charge state after transitioning from SM43_micro000 to SM43_micro001 would be −1, SM43_micro000 to SM43_micro004 would be 0 (these are tautomers of each other), SM43_micro000 to SM43_micro005 would be +1, and SM43_micro000 to SM43_micro003 would be +2.
Fig. 3
Using microstate probabilities to convert microscopic pKa predictions to macroscopic pKa values with the titration method. Blue and orange lines represent two states. Blue states have one more proton than the orange states, and thus a formal charge higher by +1. The blue state has one tautomer and the orange state has 3, denoted by the dashed lines. The solid lines are the ensemble-averaged state probability for each group with a given charge. The crossing point between two ensemble lines is the macroscopic pKa.
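The crossing-point idea in the Fig. 3 caption can be illustrated with a minimal sketch. It rests on assumptions that are not stated on this page: each microstate is described by a formal charge and a relative free energy at pH 0 (in kcal/mol, relative to the neutral reference), the temperature is 298.15 K, and raising the pH penalizes each extra proton by RT ln(10) per pH unit. The function name `macroscopic_pkas` is hypothetical, and this is not necessarily the organizers' analysis code.

```python
import math
from collections import defaultdict

# Assumption: T = 298.15 K; gas constant in kcal/(mol K).
RT_KCAL = 0.0019872041 * 298.15
LN10 = math.log(10.0)

def macroscopic_pkas(microstates):
    """Estimate macroscopic pKa values from microstate free energies.

    microstates: iterable of (formal_charge, dg_ph0_kcal) pairs, where
    dg_ph0_kcal is the free energy relative to the neutral reference
    microstate at pH 0 (assumed convention). The reference itself is (0, 0.0).

    The total population of charge state q at a given pH is proportional to
    Z_q * 10**(-q * pH), with Z_q the sum of Boltzmann weights of its
    tautomers. Populations of adjacent charge states q+1 and q cross at
    pH = log10(Z_{q+1} / Z_q), which is taken as the macroscopic pKa.
    """
    weights = defaultdict(float)
    for charge, dg in microstates:
        weights[charge] += math.exp(-dg / RT_KCAL)

    charges = sorted(weights)
    pkas = {}
    for low, high in zip(charges, charges[1:]):
        if high - low == 1:  # only adjacent charge states define a pKa
            pkas[(high, low)] = math.log10(weights[high] / weights[low])
    return pkas

# One protonated (neutral) state and three equivalent deprotonated tautomers,
# each with a microscopic pKa of 5 (mirrors the 1-vs-3 layout in the caption).
dg_micro = 5.0 * RT_KCAL * LN10
example = macroscopic_pkas([(0, 0.0), (-1, dg_micro), (-1, dg_micro), (-1, dg_micro)])
print(example)
```

With three equivalent deprotonated tautomers the macroscopic pKa is shifted below the microscopic value by log10(3), which is why tautomer enumeration matters when comparing macroscopic predictions.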
Fig. 4
Overall accuracy assessment for all methods participating in the SAMPL7 log P challenge shows that many methods did not exhibit statistically significant differences in performance and there was no single clear winner; however, empirical methods tended to perform better in general. Both root-mean-square error (RMSE) and mean absolute error (MAE) are shown, with error bars denoting 95% confidence intervals obtained by bootstrapping over challenge molecules. Empirical methods outperform the majority of the other methods. Methods that achieved an RMSE ≤ 1.0 log P units were mainly empirical, and some were QM-based physical methods. Submitted methods are listed in Table 1. The submission REF1 ChemAxon [80] was a reference method included after the blind challenge submission deadline, and NULL0 mean cLogP FDA is the null prediction method; all others refer to blind predictions.
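As an illustration of the error statistics named in the caption, the sketch below computes RMSE and MAE with a simple percentile bootstrap over molecules. The exact bootstrap procedure used in the challenge analysis is not described on this page, so the resampling scheme, the number of resamples, and the names `rmse`, `mae`, and `bootstrap_ci` are assumptions made for illustration only.

```python
import math
import random

def rmse(errors):
    """Root-mean-square error of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(errors):
    """Mean absolute error of a list of prediction errors."""
    return sum(abs(e) for e in errors) / len(errors)

def bootstrap_ci(predicted, experimental, stat, n_boot=2000, seed=0):
    """Point estimate and 95% percentile-bootstrap interval for `stat`.

    Molecules are resampled with replacement; each resample has the same
    size as the original set. Returns (estimate, lower, upper).
    """
    rng = random.Random(seed)
    errors = [p - x for p, x in zip(predicted, experimental)]
    n = len(errors)
    samples = sorted(
        stat([errors[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = samples[int(0.025 * n_boot)]
    upper = samples[int(0.975 * n_boot) - 1]
    return stat(errors), lower, upper

# Made-up log P values, purely for demonstration (not challenge data).
pred = [1.2, 0.4, 2.9, 1.8, 0.1]
expt = [1.0, 1.1, 2.5, 2.6, 0.3]
print(bootstrap_ci(pred, expt, rmse))
print(bootstrap_ci(pred, expt, mae))
```

Overlapping intervals between two methods are what the caption means by performance differences that are not statistically significant.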
Fig. 5
Predicted vs. experimental value correlation plots of the 5 best-performing methods and one representative average method in the SAMPL7 log P challenge. Dark and light green shaded areas indicate 0.5 and 1.0 units of error. Error bars indicate standard error of the mean of predicted and experimental values. In some cases, log P SEM values are too small to be seen under the data points. The best-performing methods were made up of three empirical methods (ClassicalGSG DB3 [85], TFE MLR [87], Chemprop [88]) and two QM-based physical methods (COSMO-RS [89], TFE-NHLBI-TZVP-QM). Details of the methods can be found in the "A shortlist of consistently well-performing methods in the pKa challenge" section and performance statistics are available in Table 2. Method NES-1 (GAFF2/OPC3 G) was selected as the representative average method, which has a median RMSE.
Fig. 6
Molecule-wise prediction accuracy in the log P challenge points to isoxazoles as poorly predicted, especially by MM-based physical methods. Molecules are labeled with their compound class as a reference. A The MAE calculated for each molecule as an average of all methods. B The MAE of each molecule separated by method category. C log P prediction error distribution for each molecule across all prediction methods.
Fig. 7
Overall accuracy assessment for all methods participating in the SAMPL7 pKa challenge shows that two methods, one a Physical (QM) method and one a QSPR/ML method, performed better than other methods. Both root-mean-square error (RMSE) and mean absolute error (MAE) are shown, with error bars denoting 95% confidence intervals obtained by bootstrapping over challenge molecules. REF00_Chemaxon_Chemicalize [80] is a reference method that was included after the blind challenge submission deadline, and all other method names refer to blind predictions. Methods are listed in Table 3 and statistics calculated for all methods are available in Table S3.
Fig. 8
Overall correlation assessment for all methods participating in the SAMPL7 pKa challenge shows that one Physical (QM) method and one QSPR/ML reference method exhibited modestly better performance than others. Pearson’s R² and Kendall’s Rank Correlation Coefficient Tau (τ) are shown, with error bars denoting 95% confidence intervals obtained by bootstrapping over challenge molecules. Submission methods are listed in Table 3. REF00_Chemaxon_Chemicalize [80] is a reference method that was included after the blind challenge submission deadline, and all other method names refer to blind predictions. Most methods have statistically indistinguishable performance on ranking; for R², however, two methods (EC_RISM [92], REF00_Chemaxon_Chemicalize) tend to have a greater ranking ability than the other methods. Evaluation statistics calculated for all methods are available in Table S3 of the Supplementary Information.
Fig. 9
Predicted vs. experimental value correlation plots of the 2 best-performing methods and one representative average method in the SAMPL7 pKa challenge. Dark and light green shaded areas indicate 0.5 and 1.0 units of error. Error bars indicate standard error of the mean of predicted and experimental values. Some SEM values are too small to be seen under the data points. Method DFT_M05-2X_SMD [94] was selected as the method with the median RMSE of all ranked methods analyzed in the challenge. Performance statistics of these methods are available in Table 4.
Fig. 10
Molecule-wise prediction error distribution plots show the prediction accuracy for individual molecules across all prediction methods for the pKa challenge. Molecules are labeled with their compound class as a reference. A The MAE of each molecule separated by method category suggests the most challenging molecules were different for each method category. It is difficult to draw statistically significant conclusions where there are large overlapping confidence intervals. The QM+LEC method category appears to be less accurate for the majority of the molecules compared to the other method categories. QSPR/ML methods performed better for isoxazoles (SM41-SM43) and 1,2,3-triazoles (SM44-SM46) compared to the other two method categories. Physical QM-based methods performed poorly for acylsulfonamides (SM26 and SM25). B Error distribution for each molecule over all prediction methods. SM25 has the most spread in pKa prediction error
Fig. 11
Chemical transformations that lead to common sign disagreements among participants typically involve a protonated nitrogen in terminal nitrogen groups, 1,2,3-triazoles, and isoxazoles. Shown are some chemical transformations that repeatedly show up as having large disagreement on the sign of the relative free energy prediction, as seen in Fig. 13
Fig. 12
The average relative microstate free energy predicted per microstate and the distribution across predictions in the SAMPL7 pKa challenge show how varied predictions were. Molecules are labeled with their compound class as a reference. A The average relative microstate free energy predicted per microstate. Error bars are the standard deviation of the relative microstate free energy predictions. A lower standard deviation indicates that predictions for a microstate generally agree, while a larger standard deviation means that predictions disagree. Predictions made for microstates such as SM25_micro001, SM26_micro002, SM28_micro001, SM43_micro003, SM46_micro003 widely disagree, while predictions for microstates such as SM26_micro004 are in agreement. B Distribution for each relative microstate free energy prediction over all prediction methods shows how prediction agreement among methods varied depending on the microstate
Fig. 13
The Shannon entropy (H) per microstate transition shows that participants disagree on many of the signs of the relative free energy predictions. Microstates with entropy values greater than 0 reflect increasing disagreement in the predicted sign. Microstates with an entropy of 0 are not shown here, but indicate that methods made predictions which had the same sign for the free energy change associated with a particular transition. About 44% of all microstate predictions disagreed with one another based on the sign, and the rest agreed. Roughly 5% of microstates strongly disagreed on the sign of predictions, meaning that predicted relative free energies were fairly evenly split between positive, neutral, and negative values. This indicates that these transitions were particularly challenging.
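The sign-disagreement measure described in the Fig. 13 caption can be sketched as the Shannon entropy of the positive/neutral/negative split among the predictions for one transition. The logarithm base and any tolerance for calling a value "neutral" are not given on this page; base 2 and an exact-zero threshold are assumed here, and `sign_entropy` and `tol` are hypothetical names.

```python
import math
from collections import Counter

def sign_entropy(free_energies, tol=0.0):
    """Shannon entropy (in bits, assumed base 2) of predicted signs.

    free_energies: relative free energies predicted by different methods
    for the same microstate transition. Values within `tol` of zero are
    counted as neutral (the tolerance is an illustrative assumption).
    """
    def sign(dg):
        if dg > tol:
            return "positive"
        if dg < -tol:
            return "negative"
        return "neutral"

    counts = Counter(sign(dg) for dg in free_energies)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# All methods agree on the sign: entropy is 0.
print(sign_entropy([1.2, 3.4, 0.5]))
# Evenly split between positive and negative: entropy is 1 bit.
print(sign_entropy([2.0, -2.0, 1.5, -0.7]))
```

Under these assumptions the maximum is log2(3), reached when predictions are evenly split across the three sign categories, which corresponds to the "strongly disagreed" cases in the caption.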
Fig. 14
Structures of microstates where relative microstate free energy predictions disagree. Shown are some of the microstate transitions where participants' predictions largely disagree with one another, based on Fig. 12. The average relative free energy prediction (ΔG) along with the standard deviation are listed under each transition.
Fig. 15
Overall accuracy assessment for log D estimation. Both root-mean-square error (RMSE) and mean absolute error (MAE) are shown, with error bars denoting 95% confidence intervals obtained by bootstrapping over challenge molecules. REF00_ChemAxon [80] is a reference method and NULL0 is a null method that was included after the blind challenge submission deadline, and all other method names refer to blind predictions. Methods are listed out in Table 5 and statistics calculated for all methods are available in Table S4
Fig. 16
Predicted vs. experimental value correlation plots of all log D estimation methods in the SAMPL7 challenge. Dark and light green shaded areas indicate 0.5 and 1.0 units of error. Error bars indicate standard error of the mean of predicted and experimental values. Some SEM values are too small to be seen under the data points. Performance statistics of all methods are available in Table S4.
Fig. 17
log D values from a combination of the best pKa and log P are typically superior. Shown is the RMSE in calculated log D values, with error bars denoting 95% confidence intervals from bootstrapping over challenge molecules. This plot is similar to Fig. 4, except it includes some additional pKa and log P combinations (for log D estimation). Method logP_experimental + EC_RISM combines the experimental log P with the top performing pKa method (based on RMSE). Method logP_experimental + pKa_experimental combines the experimental log P and pKa value. Method TFE MLR + EC_RISM combines the best performing (based on RMSE) log P and pKa methods. Method TFE MLR + pKa_experimental combines the best performing (based on RMSE) log P method with the experimental pKa. Method logP_experimental + DFT_M05-2X_SMD combines the experimental log P with an average performing pKa method. Method NES-1 (GAFF2/OPC3) B + pKa_experimental combines a log P method with average performance with the experimental pKa. All other methods are the same as in Fig. 4
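How a log P value and a pKa value can be combined into a log D estimate is not spelled out in the caption. A common textbook relation for a monoprotic compound, assuming only the neutral species partitions into octanol, is sketched below; whether the challenge analysis used exactly this form is an assumption, and `log_d_monoprotic` is a hypothetical name.

```python
import math

def log_d_monoprotic(log_p, pka, ph=7.4, acid=True):
    """Estimate log D at a given pH from log P and a single pKa.

    Assumes a monoprotic compound whose ionized form does not partition
    into octanol:
      acid: log D = log P - log10(1 + 10**(pH - pKa))
      base: log D = log P - log10(1 + 10**(pKa - pH))
    """
    exponent = (ph - pka) if acid else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)

# An acid with pKa 4.4 is mostly ionized at pH 7.4, so log D falls
# roughly three log units below log P.
print(log_d_monoprotic(2.0, 4.4))
# A very weak acid (pKa well above 7.4) has log D close to log P.
print(log_d_monoprotic(2.0, 12.0))
```

Because the pKa enters through an exponent, errors in a predicted pKa near the buffer pH propagate almost one-to-one into log D, which is consistent with the caption's observation that pairing the best pKa and log P methods tends to give the best log D.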

References

    1. Manallack DT. The pKa distribution of drugs: application to drug discovery. Perspect Med Chem. 2007. doi: 10.1177/1177391X0700100003.
    2. Charifson PS, Walters WP. Acidic and basic drugs in medicinal chemistry: a perspective. J Med Chem. 2014;57(23):9701–9717. doi: 10.1021/jm501000a.
    3. Aguilar B, Anandakrishnan R, Ruscio JZ, Onufriev AV. Statistics and physical origins of pK and ionization state changes upon protein-ligand binding. Biophys J. 2010;98(5):872–880. doi: 10.1016/j.bpj.2009.11.016.
    4. Rupp M, Korner RV, Tetko I. Predicting the pKa of small molecules. CCHTS. 2011;14(5):307–327. doi: 10.2174/138620711795508403.
    5. Meanwell NA. Improving drug candidates by design: a focus on physicochemical properties as a means of improving compound disposition and safety. Chem Res Toxicol. 2011;24(9):1420–1456. doi: 10.1021/tx200211v.
