Predicting human liver microsomal stability with machine learning techniques

Yojiro Sakiyama; Hitomi Yuki; Takashi Moriya; Kazunari Hattori; Misaki Suzuki; Kaoru Shimada; Teruki Honma

doi:10.1016/j.jmgm.2007.06.005

J Mol Graph Model. 2008 Feb;26(6):907-15. doi: 10.1016/j.jmgm.2007.06.005. Epub 2007 Jun 27.

Authors

Yojiro Sakiyama¹, Hitomi Yuki, Takashi Moriya, Kazunari Hattori, Misaki Suzuki, Kaoru Shimada, Teruki Honma

Affiliation

¹ Research Planning and Coordination, Nagoya Laboratories, Pfizer Global Research and Development, Pfizer Inc., 5-2 Taketoyo, Aichi 470-2393, Japan. Yojiro.Sakiyama@pfizer.com

PMID: 17683964

Abstract

To ensure a continuing pipeline in pharmaceutical research, lead candidates must possess appropriate metabolic stability in the drug discovery process. In vitro ADMET (absorption, distribution, metabolism, elimination, and toxicity) screening provides useful information regarding the metabolic stability of compounds. However, before the synthesis stage, an efficient process is required to handle the vast quantity of data from large compound libraries and high-throughput screening. Here we have derived a relationship between chemical structure and metabolic stability for a data set of in-house compounds by means of various in silico machine learning techniques, such as random forest, support vector machine (SVM), logistic regression, and recursive partitioning. For model building, 1952 proprietary compounds comprising two classes (stable/unstable) were used with 193 descriptors calculated by Molecular Operating Environment. On the test compounds, all classifiers yielded satisfactory results (accuracy > 0.8, sensitivity > 0.9, specificity > 0.6, and precision > 0.8). In particular, classification by random forest and by SVM yielded kappa values of approximately 0.7 in an independent validation set, slightly higher than those of the other classification tools. These results suggest that nonlinear/ensemble-based classification methods might prove useful in the area of in silico ADME modeling.
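
The evaluation measures named in the abstract (accuracy, sensitivity, specificity, precision, and kappa) can all be derived from a two-class confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `ConfusionMatrix` and `classification_metrics` are hypothetical, the counts in the usage note are invented for illustration and are not data from the study, and which class (stable or unstable) is treated as "positive" is an assumption, since the abstract does not say.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a two-class problem (hypothetical helper type).

    Which class counts as "positive" is an assumption made by the caller;
    the abstract does not specify it.
    """
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def classification_metrics(cm: ConfusionMatrix) -> dict:
    """Return accuracy, sensitivity, specificity, precision and Cohen's kappa.

    Assumes every denominator is non-zero, i.e. both classes occur and at
    least one compound is predicted positive.
    """
    n = cm.tp + cm.fp + cm.tn + cm.fn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (cm.tp + cm.tn) / n
    sensitivity = cm.tp / (cm.tp + cm.fn)   # true positive rate
    specificity = cm.tn / (cm.tn + cm.fp)   # true negative rate
    precision = cm.tp / (cm.tp + cm.fp)     # positive predictive value

    # Agreement expected by chance, from the marginal class frequencies.
    p_pos = ((cm.tp + cm.fn) / n) * ((cm.tp + cm.fp) / n)
    p_neg = ((cm.tn + cm.fp) / n) * ((cm.tn + cm.fn) / n)
    p_chance = p_pos + p_neg

    # Cohen's kappa: observed agreement corrected for chance agreement.
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "kappa": kappa,
    }
```

Calling `classification_metrics(ConfusionMatrix(tp=40, fp=10, tn=40, fn=10))` with these made-up counts gives the metrics for one classifier on one test set. Because kappa subtracts chance agreement, it can separate classifiers whose raw accuracies look similar on an imbalanced stable/unstable split, which may be why it is the measure the abstract uses to compare methods.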

MeSH terms

Artificial Intelligence*
Computer Simulation
Drug Evaluation, Preclinical / methods
Drug Stability
Humans
Logistic Models
Microsomes, Liver / metabolism*
Predictive Value of Tests
Quantitative Structure-Activity Relationship
Reproducibility of Results
Sensitivity and Specificity