A hybrid mixture discriminant analysis-random forest computational model for the prediction of volume of distribution of drugs in human

Franco Lombardo; R Scott Obach; Frank M Dicapua; Gregory A Bakken; Jing Lu; David M Potter; Feng Gao; Michael D Miller; Yao Zhang

doi:10.1021/jm050200r

J Med Chem. 2006 Apr 6;49(7):2262-7. doi: 10.1021/jm050200r.

Authors

Franco Lombardo¹, R Scott Obach, Frank M Dicapua, Gregory A Bakken, Jing Lu, David M Potter, Feng Gao, Michael D Miller, Yao Zhang

Affiliation

¹ Computational Chemistry and Scientific Computing Groups and Groton Non-Clinical Statistics, Pfizer Global Research and Development, Groton Laboratories, Groton, Connecticut 06340, USA. franco.lombardo@novartis.com

PMID: 16570922

Abstract

A computational approach is described that can predict the VD(ss) of new compounds in humans to within 2-fold of the actual value. A dataset of VD values for 384 drugs in humans was used to train a hybrid mixture discriminant analysis-random forest (MDA-RF) model using 31 computed descriptors. The descriptors included terms describing lipophilicity, ionization, molecular volume, and various molecular fragments. For a test set of 23 proprietary compounds not used in model construction, the geometric mean fold-error (GMFE) was 1.78-fold (±11.4%). The model was also tested using a leave-class-out approach, in which subsets of drugs defined by therapeutic class were removed from the training set of 384, the model was rebuilt, and the VD(ss) values for each held-out subset were predicted. GMFE values ranged from 1.46- to 2.94-fold, depending on the subset. Finally, for an additional set of 74 compounds, VD(ss) predictions made using the computational model were compared to predictions made using previously described methods that depend on animal pharmacokinetic data. Computational VD(ss) predictions differed, on average, by 2.13-fold from the VD(ss) predictions made from animal data. The computational model described can predict human VD(ss) with an accuracy comparable to that of predictions requiring substantially greater effort and can be applied in place of animal experimentation.
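
The abstract reports accuracy as a geometric mean fold-error and validates the model by leaving out whole therapeutic classes. The sketch below illustrates both ideas in Python. It assumes the commonly used definition of GMFE, 10 raised to the mean absolute log10 ratio of predicted to observed values; the abstract does not state the exact formula. The function names `gmfe` and `leave_class_out_splits` and the example numbers are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def gmfe(predicted, observed):
    """Geometric mean fold-error, assuming GMFE = 10 ** mean(|log10(pred / obs)|).

    A value of 1.0 means perfect agreement; 2.0 means predictions are,
    on average, 2-fold away from the observed values (in either direction).
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(p <= 0 for p in predicted) or any(o <= 0 for o in observed):
        raise ValueError("all values must be positive")
    abs_logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(abs_logs) / len(abs_logs))


def leave_class_out_splits(classes):
    """Yield (held_out_class, train_indices, test_indices) for each class.

    Each class label is held out once; all compounds carrying that label
    form the test set and the remaining compounds form the training set.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(classes):
        by_class[label].append(index)
    for label, test_indices in by_class.items():
        train_indices = [i for i, c in enumerate(classes) if c != label]
        yield label, train_indices, test_indices


if __name__ == "__main__":
    # Illustrative numbers only: one 2-fold over-prediction and one
    # 2-fold under-prediction give a GMFE of about 2.
    print(round(gmfe([2.0, 0.5], [1.0, 1.0]), 3))

    # Hypothetical therapeutic-class labels for three compounds.
    for label, train, test in leave_class_out_splits(["a", "b", "a"]):
        print(label, train, test)
```

Because the absolute value of the log ratio is taken, over- and under-predictions of the same magnitude contribute equally, which is why fold-error summaries of this kind are always at least 1. In a leave-class-out run, the model would be refit on the training indices of each split and `gmfe` computed on the held-out class.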

MeSH terms

Algorithms
Computer Simulation
Drug Design
Humans
Models, Biological*
Pharmaceutical Preparations / metabolism*
Pharmacokinetics*
Tissue Distribution

Substances

Pharmaceutical Preparations