Environ Health Perspect. 2018 May 29;126(5):057008.
doi: 10.1289/EHP2998. eCollection 2018 May.

Conditional Toxicity Value (CTV) Predictor: An In Silico Approach for Generating Quantitative Risk Estimates for Chemicals

Free PMC article

Jessica A Wignall et al. Environ Health Perspect. 2018.

Abstract

Background: Human health assessments synthesize human, animal, and mechanistic data to produce toxicity values that are key inputs to risk-based decision making. Traditional assessments are data-, time-, and resource-intensive, and they cannot be developed for most environmental chemicals owing to a lack of appropriate data.

Objectives: As recommended by the National Research Council, we propose a solution for predicting toxicity values for data-poor chemicals through development of quantitative structure-activity relationship (QSAR) models.

Methods: We used a comprehensive database of chemicals with existing regulatory toxicity values from U.S. federal and state agencies to develop QSAR models. We compared QSAR-based model predictions to those based on high-throughput screening (HTS) assays.

Results: QSAR models for noncancer threshold-based values and cancer slope factors had cross-validation-based Q² of 0.25–0.45, mean model errors of 0.70–1.11 log10 units, and applicability domains covering >80% of environmental chemicals. Toxicity values predicted from QSAR models developed in this study were more accurate and precise than those based on HTS assays or mean-based predictions. A publicly accessible web interface to make predictions for any chemical of interest is available at http://toxvalue.org.
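
The two summary statistics quoted in the Results can be illustrated with a minimal sketch. It assumes the common definition Q² = 1 − PRESS/TSS and reads "mean model error" as a mean absolute error in log10 units; the abstract does not spell out either definition, and the data values below are hypothetical, not taken from the study.

```python
from statistics import mean


def q2(observed, cv_predicted):
    """Cross-validated Q2 = 1 - PRESS / total sum of squares (assumed definition)."""
    obs_mean = mean(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    total = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - press / total


def mean_abs_error(observed, predicted):
    """Mean absolute error, in the same (log10) units as the inputs."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical log10 toxicity values and held-out (cross-validation) predictions.
observed = [-3.0, -2.0, -1.0, 0.0, 1.0]
cv_predicted = [-2.5, -2.5, -0.5, -0.5, 0.5]

q2_value = q2(observed, cv_predicted)
mae_model = mean_abs_error(observed, cv_predicted)

# Simplified mean-based comparison: predict the overall observed mean everywhere.
mae_baseline = mean_abs_error(observed, [mean(observed)] * len(observed))

# An error of x log10 units corresponds to a factor of 10**x on the original scale.
fold_error = 10 ** mae_model
```

On this reading, a mean error of 1 log10 unit corresponds to a factor of 10, that is, one order of magnitude.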

Conclusions: An in silico tool that can predict toxicity values with an uncertainty of an order of magnitude or less can be used to quickly and quantitatively assess risks of environmental chemicals when traditional toxicity data or human health assessments are unavailable. This tool can fill a critical gap in the risk assessment and management of data-poor chemicals. https://doi.org/10.1289/EHP2998.

Figures

Figures 1A and 1B are three-dimensional scatter plots. Figures 1C, 1D and 1E are histograms plotting relative frequency (y-axis) across ratio of Octanol to Water Partition Coefficient (AlogP), molecular weight (MW), and topological polar surface area, respectively, for the CTV and CERAPP.
Figure 1.
Chemical space coverage of 886 conditional toxicity value (CTV) chemicals compared to 32,464 structures in the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP) “prediction” data set, based on chemical descriptors generated from the Chemistry Development Kit (CDK). (A) Three-dimensional (3-D) plot of coverage comparison based on first three principal components (PC); (B–E) Coverage comparisons based on easily interpretable descriptors octanol:water partition coefficient (ALogP), molecular weight (MW), and topological polar surface area (TopoPSA). Panel (B) is a 3-D plot, and panels (C–E) are histograms comparing the distribution of compounds for each descriptor. The 3-D plots were created in R (version 3.3.2; R Core Team).
Figures 2A, 2B, 2C, and 2D are scatter plots with regression lines plotting observed value (log 10 mole) (y-axis) across prediction (log 10 mole) (x-axis) for RfD, RfD NOAEL, RfD BMD, and RfD BMDL, respectively. Figures 2E, 2F, 2G, and 2H are line graphs of model prediction and using mean observation plotting percentage with greater than absolute error (y-axis) across absolute error (log 10 mole) (x-axis) for RfD, RfD NOAEL, RfD BMD, and RfD BMDL, respectively. Subparts E, F, G and H also comprise box and whisker plots.
Figure 2.
Conditional toxicity value (CTV) prediction errors based on 5-fold cross-validation for the RfD and associated NOAEL, BMD, or BMDL. (A–D) Scatter plots of QSAR-predicted versus “observed” regulatory toxicity values, with the dotted lines showing the range encompassing 90% of the predictions. (E–H) Comparison of the distributions of absolute prediction errors for CTV (solid line) and an alternative model using the observed mean value for every prediction (dashed line). Box plots show the corresponding interquartile region (box), median (line in box), mean (dot in box), and 95% confidence interval (whiskers). Note: Abs., absolute; BMD, benchmark dose; BMDL, benchmark dose lower confidence limit; NOAEL, no observed adverse effect level; QSAR, quantitative structure–activity relationship; RfD, reference dose.
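
The error-distribution curves described in this caption plot, for each absolute-error threshold, the percentage of predictions whose error exceeds it. A minimal sketch of that calculation follows; the function name and the error values are hypothetical and only illustrate how such a curve is built.

```python
def exceedance_curve(abs_errors, thresholds):
    """Percentage of predictions whose absolute error exceeds each threshold."""
    n = len(abs_errors)
    return [100.0 * sum(e > t for e in abs_errors) / n for t in thresholds]


# Hypothetical absolute prediction errors, in log10 units.
abs_errors = [0.1, 0.4, 0.6, 0.9, 1.5]
thresholds = [0.0, 0.5, 1.0, 2.0]

curve = exceedance_curve(abs_errors, thresholds)
```

A curve that falls faster toward zero indicates smaller errors, which is how the solid and dashed lines in the panels can be compared.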
Figures 3A and 3B are scatter plots with regression lines plotting observed value (log 10 mole) (y-axis) across prediction (log 10 mole) (x-axis) for OSF and CPV, respectively. Figures 3C and 3D are line graphs of model prediction and using mean observation plotting percentage with greater than absolute error (y-axis) across absolute error (log 10 mole) (x-axis) for OSF and CPV, respectively. Subparts C and D also comprise box and whisker plots.
Figure 3.
Conditional toxicity value (CTV) prediction errors based on 5-fold cross-validation for the oral slope factor (OSF) and cancer potency value (CPV). (A–B) Scatter plots of QSAR-predicted versus “observed” regulatory toxicity values, with the dotted lines showing the range encompassing 90% of the predictions. (C–D) Comparison of the distributions of absolute prediction errors for CTV (solid line) and an alternative model using the observed mean value for every prediction (dashed line). Box plots show the corresponding interquartile region (box), median (line in box), mean (dot in box), and 95% confidence interval (whiskers). Note: Abs., absolute; QSAR, quantitative structure–activity relationship.
Figures 4A and 4B are scatter plots with regression lines plotting observed value (log 10 mole) (y-axis) across prediction (log 10 mole) (x-axis) for RfC and IUR, respectively. Figures 4C and 4D are line graphs of model prediction and using mean observation plotting percentage with greater than absolute error (y-axis) across absolute error (log 10 mole) (x-axis) for RfC and IUR, respectively. Subparts C and D also comprise box and whisker plots.
Figure 4.
Conditional toxicity value (CTV) prediction errors based on 5-fold cross-validation for the reference concentration (RfC) and inhalation unit risk (IUR). (A, B) Scatter plots of QSAR-predicted versus “observed” regulatory toxicity values, with the dotted lines showing the range encompassing 90% of the predictions. (C, D) Comparison of the distributions of absolute prediction errors for CTV (solid line) and an alternative model using the observed mean value for every prediction (dashed line). Box plots show the corresponding interquartile region (box), median (line in box), mean (dot in box), and 95% confidence interval (whiskers). Note: Abs., absolute; QSAR, quantitative structure–activity relationship.
Figure 5A is a line graph plotting descriptor rank (y-axis) across frequency of use (x-axis) for RfD. Figure 5B is a scatter plot with descriptors (y-axis) across their frequency of use (x-axis). Figure 5C plots comparison of the descriptor values ALogP (y-axis) across BCUTp.1l (x-axis) for the top 10 percent RfD and bottom 10 percent RfD.
Figure 5.
Example of mechanistic interpretation of QSAR model for the RfD. (A) Ranks of the molecular descriptors by their frequency of use in the model. The top twenty are denoted by dashed lines. (B) The top twenty descriptors shown separately, with the descriptor names. (C) Comparison of the descriptor values for the top two descriptors between the highest and lowest potency RfDs. Note: QSAR, quantitative structure–activity relationship; RfD, reference dose. Definitions of each molecular descriptor can be found online: https://cdk.github.io/cdk/1.5/docs/api/org/openscience/cdk/qsar/descriptors/molecular/package-summary.html.
Screen shots of the conditional toxicity value predictor web page.
Figure 6.
Screen shots illustrating use of the Conditional Toxicity Value (CTV) Predictor web portal (http://toxvalue.org). Steps include searching for a chemical, verifying its identity, selecting toxicity values of interest, running quantitative structure–activity relationship (QSAR) models, and exporting predictions to a comma-separated-values file. The entire process can be completed in approximately 60 s. CTV is a public web portal maintained by two of the authors (IR and WAC).
Figure 7A is a scatter plot with regression lines plotting regulatory NOAEL (y-axis) across cross-validation CTV NOAEL (x-axis) where R squared equals 0.47 (n equals 36). Figure 7B is a scatter plot with regression lines plotting regulatory NOAEL (y-axis) across ToxCast OED05 (x-axis) where R squared equals 0.12 (n equals 36). Figure 7C is a scatter plot with regression lines plotting regulatory BMDL (y-axis) across cross-validation CTV BMDL (x-axis) where R squared equals 0.59 (n equals 14). Figure 7D is a scatter plot with regression lines plotting regulatory BMDL (y-axis) across ToxCast OED05 (x-axis) where R squared equals 0.061 (n equals 14). Figure 7E is a scatter plot with regression lines plotting regulatory RfD (y-axis) across cross-validation CTV RfD (x-axis) where R squared equals 0.36 (n equals 51). Figure 7F is a scatter plot with regression lines plotting regulatory RfD (y-axis) across ToxCast OED05 (x-axis) where R squared equals 0.087 (n equals 51).
Figure 7.
Comparison of conditional toxicity value (CTV)-based (A, C, E) or high-throughput screening (HTS) assay-based (B, D, F) toxicity value predictions with “gold standard” regulatory toxicity values. In each panel, the x-axis is the toxicity value predicted from CTV (left panels) or based on HTS assays (right panels), which is compared with regulatory toxicity values on the y-axis (all values are in units of mg/kg·d). Comparisons are made for regulatory NOAELs (panels A and B), BMDLs (panels C and D), or RfDs (panels E and F). In all cases, the predictions from CTV are based on cross-validation (panels A, C, and E). Panels A–E also include lines indicating equality and a factor of 10 greater or less than equality (solid and dotted lines); for panel F, the line is offset by a factor of 10^2.5 ≈ 300 to account for the fact that the HTS-based oral equivalent dose lower 5% confidence limit (OED05) is a point of departure and is not meant to be equivalent to an RfD. The offset is approximately equivalent to treating OED05/300 as a surrogate for the RfD. The value of this offset was determined by the intercept of the linear regression. Each panel also includes linear regression lines (dashed lines), along with the number of compounds (n) and the adjusted R². Note: BMDL, benchmark dose lower confidence limit; HTS OED05, high-throughput screening–based oral equivalent dose lower 5% confidence limit; NOAEL, no observed adverse effect level; RfD, reference dose.
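
The panel F offset can be checked with a short worked example. The sketch below assumes the offset is 2.5 log10 units, rounded in the caption to a factor of about 300; the function name and the example OED05 value are hypothetical.

```python
import math

# Offset in log10 units, as stated in the caption for panel F.
LOG10_OFFSET = 2.5

# Converting the log10 offset to a multiplicative factor gives about 316,
# which the caption rounds to approximately 300.
offset_factor = 10 ** LOG10_OFFSET


def rfd_surrogate(oed05_mg_per_kg_day, divisor=300.0):
    """Treat OED05 / 300 as a rough surrogate for the RfD (both in mg/kg-day)."""
    return oed05_mg_per_kg_day / divisor


# Hypothetical OED05 of 3 mg/kg-day gives a surrogate RfD of about 0.01 mg/kg-day.
example_surrogate = rfd_surrogate(3.0)

# Going the other way, a divisor of 300 is about 2.48 log10 units.
log10_of_300 = math.log10(300.0)
```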
