Predicting human liver microsomal stability with machine learning techniques

J Mol Graph Model. 2008 Feb;26(6):907-15. doi: 10.1016/j.jmgm.2007.06.005. Epub 2007 Jun 27.


To sustain a pharmaceutical research pipeline, lead candidates must possess appropriate metabolic stability during drug discovery. In vitro ADMET (absorption, distribution, metabolism, elimination, and toxicity) screening provides useful information on the metabolic stability of compounds. However, before the synthesis stage, an efficient process is required to handle the vast quantity of data from large compound libraries and high-throughput screening. Here we have derived a relationship between chemical structure and metabolic stability for a data set of in-house compounds using various in silico machine learning methods, such as random forest, support vector machine (SVM), logistic regression, and recursive partitioning. For model building, 1952 proprietary compounds comprising two classes (stable/unstable) were used with 193 descriptors calculated by Molecular Operating Environment. On test compounds, all classifiers yielded satisfactory results (accuracy > 0.8, sensitivity > 0.9, specificity > 0.6, and precision > 0.8). Above all, classification by random forest as well as SVM yielded kappa values of approximately 0.7 in an independent validation set, slightly higher than those of the other classification tools. These results suggest that nonlinear/ensemble-based classification methods might prove useful in the area of in silico ADME modeling.
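The evaluation measures quoted above (accuracy, sensitivity, specificity, precision, and Cohen's kappa) can all be derived from a two-class confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The function name `binary_metrics`, the choice of which class counts as "positive", and the example counts are illustrative assumptions; the abstract does not report the underlying confusion matrices.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def binary_metrics(tp, fp, tn, fn):
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: positive-class compounds predicted positive/negative.
    tn/fp: negative-class compounds predicted negative/positive.
    Which class (stable or unstable) is "positive" is an assumption
    left to the caller; the abstract does not specify it.
    """
    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / n          # observed agreement
    sensitivity = _ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)  # true negative rate
    precision = _ratio(tp, tp + fp)    # positive predictive value

    # Cohen's kappa: agreement corrected for the agreement expected by
    # chance, computed from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = _ratio(accuracy - expected, 1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the study).
    for name, value in binary_metrics(tp=90, fp=20, tn=30, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because kappa discounts chance agreement, it can be noticeably lower than accuracy when the two classes are imbalanced, which is one reason to report it alongside the other four measures.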

MeSH terms

  • Artificial Intelligence*
  • Computer Simulation
  • Drug Evaluation, Preclinical / methods
  • Drug Stability
  • Humans
  • Logistic Models
  • Microsomes, Liver / metabolism*
  • Predictive Value of Tests
  • Quantitative Structure-Activity Relationship
  • Reproducibility of Results
  • Sensitivity and Specificity