The use of machine learning and nonlinear statistical tools for ADME prediction

Expert Opin Drug Metab Toxicol. 2009 Feb;5(2):149-69. doi: 10.1517/17425250902753261.

Abstract

Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME properties with in silico tools has become an indispensable approach for reducing cost and improving efficiency in pharmaceutical research. Recently, machine learning and nonlinear statistical tools have been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it is essential to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help explain the mechanism of machine learning through 2D visualisation. We applied six new machine learning methods to four different data sets. The methods are the naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k-nearest neighbour. The results demonstrated that ensemble learning and kernel machines predicted more accurately than classical methods, irrespective of data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning that will enable its appropriate use in the future.
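
The idea of a small synthetic nonlinear data set can be illustrated with a minimal sketch. It is not the authors' data set or code: the XOR-style 2D labelling, the sample sizes, the seeds and the function names (`make_xor`, `knn_predict`, `centroid_predict`) are all assumptions made for illustration. It compares k-nearest neighbour, one of the six methods named above, with a nearest-centroid rule that stands in for a classical linear method.

```python
import random

def make_xor(n, seed=0):
    """Hypothetical 2D data set: label 1 when x and y share a sign (XOR-like)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(((x, y), 1 if x * y > 0 else 0))
    return data

def sq_dist(a, b):
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def knn_predict(train, point, k=5):
    """k-nearest neighbour: majority vote among the k closest training points."""
    nearest = sorted(train, key=lambda item: sq_dist(item[0], point))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def centroid_predict(train, point):
    """Nearest class centroid: a linear decision boundary (assumed baseline)."""
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for (x, y), label in train:
        sums[label][0] += x
        sums[label][1] += y
        sums[label][2] += 1
    best_label, best_dist = None, float("inf")
    for label, (sx, sy, count) in sums.items():
        d = sq_dist((sx / count, sy / count), point)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

def accuracy(predict, train, test):
    """Fraction of test points whose predicted label matches the true label."""
    return sum(predict(train, p) == y for p, y in test) / len(test)

train = make_xor(200, seed=0)
test = make_xor(200, seed=1)
knn_acc = accuracy(knn_predict, train, test)
lin_acc = accuracy(centroid_predict, train, test)
print(f"k-nearest neighbour accuracy: {knn_acc:.2f}")
print(f"nearest-centroid accuracy:    {lin_acc:.2f}")
```

On data like this, any single straight boundary splits each class roughly in half, so the linear rule stays near chance while the local, nonlinear method recovers the quadrant structure. The sketch only illustrates why nonlinear tools can help; it does not reproduce the comparison reported in the review.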

Publication types

  • Review

MeSH terms

  • Artificial Intelligence*
  • Data Interpretation, Statistical
  • Drug Industry / methods
  • Humans
  • Models, Statistical*
  • Nonlinear Dynamics
  • Pharmaceutical Preparations / chemistry
  • Pharmaceutical Preparations / metabolism*
  • Pharmacokinetics
  • Quantitative Structure-Activity Relationship

Substances

  • Pharmaceutical Preparations