P-Glycoprotein (P-gp, ABCB1) plays a significant role in determining the ADMET properties of drugs and drug candidates. Substrates of P-gp are not only subject to multidrug resistance (MDR) in tumor therapy, they are also associated with poor pharmacokinetic profiles. In contrast, inhibitors of P-gp have been advocated as modulators of MDR. However, due to the polyspecificity of P-gp, knowledge on the molecular basis of ligand-transporter interaction is still poor, which renders the prediction of whether a compound is a P-gp substrate/non-substrate or an inhibitor/non-inhibitor quite challenging. In the present investigation, we used a set of fingerprints representing the presence/absence of various functional groups for machine learning based classification of a set of 484 substrates/non-substrates and a set of 1935 inhibitors/non-inhibitors. Best models were obtained using a combination of a wrapper subset evaluator (WSE) with random forest (RF), kappa nearest neighbor (kNN) and support vector machine (SVM), showing accuracies >70%. Best P-gp substrate models were further validated with three sets of external P-gp substrate sources, which include Drug Bank (n = 134), TP Search (n = 90) and a set compiled from literature (n = 76). Association rule analysis explores the various structural feature requirements for P-gp substrates and inhibitors.
Copyright © 2012 Elsevier Ltd. All rights reserved.