Structure-based QSAR Models to Predict Repeat Dose Toxicity Points of Departure

Comput Toxicol. 2020 Nov 1;16(November 2020):10.1016/j.comtox.2020.100139. doi: 10.1016/j.comtox.2020.100139.


Human health risk assessment for environmental chemical exposure is limited by a vast majority of chemicals with little or no experimental in vivo toxicity data. Data gap filling techniques, such as quantitative structure activity relationship (QSAR) models based on chemical structure information, can predict hazard in the absence of experimental data. Risk assessment requires identification of a quantitative point-of-departure (POD) value, the point on the dose-response curve that marks the beginning of a low-dose extrapolation. This study presents two sets of QSAR models to predict POD values (PODQSAR) for repeat dose toxicity. For training and validation, a publicly available in vivo toxicity dataset for 3592 chemicals was compiled using the U.S. Environmental Protection Agency's Toxicity Value database (ToxValDB). The first set of QSAR models predict point-estimates of POD values (PODQSAR) using structural and physicochemical descriptors for repeat dose study types and species combinations. A random forest QSAR model using study type and species as descriptors showed the best performance, with an external test set root mean square error (RMSE) of 0.71 log10-mg/kg/day and coefficient of determination (R2) of 0.53. The second set of QSAR models predict the 95% confidence intervals for PODQSAR using a constructed POD distribution with a mean equal to the median POD value and a standard deviation of 0.5 log10-mg/kg/day, based on previously published typical study-to-study variability that may lead to uncertainty in model predictions. Bootstrap resampling of the pre-generated POD distribution was used to derive point-estimates and 95% confidence intervals for each POD prediction. Enrichment analysis to evaluate the accuracy of PODQSAR showed that 80% of the 5% most potent chemicals were found in the top 20% of the most potent chemical predictions, suggesting that the repeat dose POD QSAR models presented here may help inform screening level human health risk assessments in the absence of other data.