A k-nearest neighbor (k-NN) classification model was constructed for 118 chemicals from the RDT NEDO (Repeated Dose Toxicity, New Energy and Industrial Technology Development Organization) database, currently known as the Hazard Evaluation Support System (HESS). The model used two acute toxicity (LD50)-based classes as the response and a series of eight PaDEL software-derived fingerprints as predictor variables. A model developed using Estate-type fingerprints correctly predicted the LD50 classes for 70 of 94 training set chemicals (74%) and 19 of 24 test set chemicals (79%). An individual category was formed for each chemical by extracting the corresponding k-analogs identified by the k-NN classification. These categories were then used in a read-across study to predict chronic toxicity, expressed as the Lowest Observed Effect Level (LOEL). The LOELs of 54 of 70 training set chemicals (77%) and 14 of 19 test set chemicals (74%) were predicted to within an order of magnitude of their experimental values. Given these results, we conclude that if the k-NN model correctly predicts the LD50 class of a chemical, then the k-analogs of that chemical can be used successfully for data gap filling for the LOEL. This model should support the in silico prediction of repeated dose toxicity.
Keywords: Estate fingerprint; LD50; LOEL; category formation; classification model; k-nearest neighbor; read-across.
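
The workflow summarized above (k-NN classification, category formation from the k-analogs, read-across of the LOEL, and the order-of-magnitude check) can be illustrated with a minimal sketch. It is not the implementation used in this study. Several details are assumptions made only for illustration: Tanimoto similarity as the neighbor metric, a majority vote for the LD50 class, the geometric mean of the analogs' LOELs as the read-across estimate, and the toy fingerprints, class labels, and LOEL values, all of which are invented.

```python
import math
from collections import Counter


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def k_analogs(query_fp, train, k):
    """Return the k training records most similar to the query (the category)."""
    ranked = sorted(train, key=lambda rec: tanimoto(query_fp, rec["fp"]), reverse=True)
    return ranked[:k]


def predict_ld50_class(analogs):
    """Majority vote over the LD50 classes of the analogs."""
    return Counter(rec["ld50_class"] for rec in analogs).most_common(1)[0][0]


def read_across_loel(analogs):
    """Geometric mean of the analogs' LOELs (assumed aggregation rule)."""
    logs = [math.log10(rec["loel"]) for rec in analogs]
    return 10 ** (sum(logs) / len(logs))


def within_order_of_magnitude(predicted, experimental):
    """True if the prediction is within a factor of 10 of the experimental value."""
    return abs(math.log10(predicted) - math.log10(experimental)) <= 1.0


# Hypothetical training set: 6-bit fingerprints, LD50 class, LOEL in mg/kg/day.
TRAIN = [
    {"name": "A", "fp": [1, 1, 0, 0, 1, 0], "ld50_class": "low", "loel": 10.0},
    {"name": "B", "fp": [1, 1, 0, 0, 0, 0], "ld50_class": "low", "loel": 100.0},
    {"name": "C", "fp": [0, 0, 1, 1, 0, 1], "ld50_class": "high", "loel": 1000.0},
    {"name": "D", "fp": [1, 0, 0, 0, 1, 0], "ld50_class": "low", "loel": 1000.0},
    {"name": "E", "fp": [0, 0, 1, 1, 0, 0], "ld50_class": "high", "loel": 500.0},
]

query_fp = [1, 1, 0, 0, 1, 1]          # hypothetical query chemical
analogs = k_analogs(query_fp, TRAIN, k=3)
predicted_class = predict_ld50_class(analogs)
predicted_loel = read_across_loel(analogs)

print([rec["name"] for rec in analogs], predicted_class, predicted_loel)
```

In this sketch the same k-analogs serve both purposes: they vote on the LD50 class and they form the category from which the LOEL is read across, which mirrors the idea that a correct LD50 class prediction indicates a category suitable for LOEL data gap filling.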