Ligand and structure-based classification models for prediction of P-glycoprotein inhibitors

J Chem Inf Model. 2014 Jan 27;54(1):218-29. doi: 10.1021/ci400289j. Epub 2014 Jan 9.


The ABC transporter P-glycoprotein (P-gp) actively transports a wide range of drugs and toxins out of cells, and is therefore related to multidrug resistance and the ADME profile of therapeutics. Thus, development of predictive in silico models for the identification of P-gp inhibitors is of great interest in the field of drug discovery and development. So far, in silico P-gp inhibitor prediction has been dominated by ligand-based approaches because of the lack of high-quality structural information about P-gp. The present study compares the P-gp inhibitor/noninhibitor classification performance obtained by docking into a homology model of P-gp with that of supervised machine learning methods, such as k-nearest neighbor, support vector machine (SVM), random forest, and binary QSAR, using a large, structurally diverse data set. In addition, the applicability domain of the models was assessed using an algorithm based on Euclidean distance. Results show that random forest and SVM performed best for classification of P-gp inhibitors and noninhibitors, correctly predicting 73/75% of the external test set compounds. Classification based on the docking experiments using the scoring function ChemScore resulted in the correct prediction of 61% of the external test set. This demonstrates that ligand-based models currently remain the methods of choice for accurately predicting P-gp inhibitors. However, structure-based classification offers information about possible drug/protein interactions, which helps in understanding the molecular basis of ligand-transporter interaction and could therefore also support lead optimization.
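The abstract states only that the applicability domain was assessed with an algorithm based on Euclidean distance; it does not give the details. The following is a minimal sketch of one common distance-based scheme, not the authors' implementation. The function names, the neighbor count `k`, and the cutoff factor `z` are illustrative assumptions, and descriptors are assumed to be numeric and already scaled to comparable ranges.

```python
import math
import statistics


def mean_knn_distance(point, reference, k, skip_self=False):
    """Mean Euclidean distance from `point` to its k nearest reference points."""
    dists = sorted(math.dist(point, r) for r in reference)
    if skip_self:
        # The smallest distance is the point's distance to itself (zero).
        dists = dists[1:]
    return sum(dists[:k]) / k


def applicability_domain(train, queries, k=3, z=0.5):
    """Flag each query compound as inside (True) or outside (False) the domain.

    Hypothetical rule (an assumption, not taken from the paper): a query is
    inside the domain if its mean distance to its k nearest training compounds
    does not exceed mean + z * std of the same quantity computed within the
    training set. Requires more than k training compounds.
    """
    train_d = [mean_knn_distance(p, train, k, skip_self=True) for p in train]
    threshold = statistics.mean(train_d) + z * statistics.pstdev(train_d)
    return [mean_knn_distance(q, train, k) <= threshold for q in queries]


if __name__ == "__main__":
    # Toy two-descriptor training set and two query compounds.
    train = [[0, 0], [0, 1], [1, 0], [1, 1]]
    queries = [[0.5, 0.5], [10, 10]]
    print(applicability_domain(train, queries, k=2))
```

Under such a scheme, predictions for compounds flagged as outside the domain would be treated as less reliable, since they lie far from the chemical space covered by the training set.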

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • ATP Binding Cassette Transporter, Subfamily B / antagonists & inhibitors*
  • ATP Binding Cassette Transporter, Subfamily B / chemistry*
  • Algorithms
  • Animals
  • Artificial Intelligence
  • Binding Sites
  • Computational Biology
  • Computer Simulation
  • Databases, Chemical
  • Drug Discovery
  • Humans
  • Ligands
  • Models, Molecular
  • Principal Component Analysis
  • Protein Structure, Tertiary
  • Quantitative Structure-Activity Relationship
  • Structural Homology, Protein
  • Support Vector Machine


Substances

  • ATP Binding Cassette Transporter, Subfamily B
  • Ligands