Development of a Support Vector Machine-Based System to Predict Whether a Compound Is a Substrate of a Given Drug Transporter Using Its Chemical Structure

J Pharm Sci. 2016 Jul;105(7):2222-30. doi: 10.1016/j.xphs.2016.04.023. Epub 2016 Jun 1.


The aim of this study was to develop an in silico prediction system that uses only the chemical structure of a compound to assess which of 7 categories of drug transporters (organic anion transporting polypeptide [OATP] 1B1/1B3, multidrug resistance-associated protein [MRP] 2/3/4, organic anion transporter [OAT] 1, OAT3, organic cation transporter [OCT] 1/2/multidrug and toxin extrusion [MATE] 1/2-K, multidrug resistance protein 1 [MDR1], and breast cancer resistance protein [BCRP]) can recognize it as a substrate. We compiled an internal data set consisting of 260 compounds that are substrates for at least 1 of the 7 categories of drug transporters. Four physicochemical parameters (charge, molecular weight, lipophilicity, and plasma unbound fraction) of each compound were used as the basic descriptors. Furthermore, a greedy algorithm was used to select 3 additional physicochemical descriptors from 731 available descriptors. Because transporter nonsubstrates tend not to be reported in the public domain, we also compiled an expert-curated data set of putative nonsubstrates for each transporter, based on the opinions of 11 researchers in the field of drug transporters. The best prediction was achieved by a support vector machine based on the 4 basic and 3 additional descriptors. The model correctly judged that 364 of 412 compounds (88.3%, internal data set) and 111 of 136 compounds (81.6%, external data set) were substrates, indicating that this model performs well enough to predict the specificity of transporter substrates.
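
The descriptor selection step described in the abstract can be illustrated with a minimal sketch of greedy forward selection: start from a fixed set of basic descriptors and repeatedly add the candidate that most improves a score. The abstract does not state the scoring criterion, the kernel, or the descriptor names, so everything below is an assumption made for illustration. In particular, a leave-one-out nearest-centroid classifier stands in for the support vector machine to keep the example dependency-free, and the function names and toy data are hypothetical.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


def loo_accuracy(X: Sequence[Row], y: Sequence[int], cols: Sequence[int]) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier on the given columns.

    Assumption: this simple classifier is only a stand-in for the SVM
    mentioned in the abstract.
    """
    correct = 0
    for i in range(len(X)):
        sums: dict = {}
        counts: dict = {}
        # Accumulate per-class column sums, leaving out sample i.
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(cols))
            for k, c in enumerate(cols):
                acc[k] += row[c]
            counts[label] = counts.get(label, 0) + 1

        def sq_dist(label: int) -> float:
            # Squared Euclidean distance from sample i to the class centroid.
            return sum(
                (X[i][c] - sums[label][k] / counts[label]) ** 2
                for k, c in enumerate(cols)
            )

        predicted = min(sums, key=sq_dist)
        correct += int(predicted == y[i])
    return correct / len(X)


def greedy_select(
    X: Sequence[Row],
    y: Sequence[int],
    base: Sequence[int],
    candidates: Sequence[int],
    n_extra: int,
    score: Callable[[Sequence[Row], Sequence[int], Sequence[int]], float] = loo_accuracy,
) -> List[int]:
    """Greedily add n_extra candidate columns to the base columns."""
    selected = list(base)
    remaining = [c for c in candidates if c not in selected]
    for _ in range(n_extra):
        if not remaining:
            break
        # Pick the single candidate that gives the best score when added.
        best = max(remaining, key=lambda c: score(X, y, selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: column 0 is a "basic" descriptor, column 1 is
# uninformative, and column 2 separates the two classes.
X_toy = [
    [0.0, 5.0, 0.1],
    [0.1, 1.0, 0.0],
    [0.0, 3.0, 0.2],
    [0.1, 4.0, 1.0],
    [0.0, 2.0, 0.9],
    [0.1, 6.0, 1.1],
]
y_toy = [0, 0, 0, 1, 1, 1]  # 0 = putative nonsubstrate, 1 = substrate

selected_toy = greedy_select(X_toy, y_toy, base=[0], candidates=[1, 2], n_extra=1)
print(selected_toy)  # prints [0, 2]
```

In the setting of the abstract, `base` would hold the 4 basic descriptors, `candidates` the 731 available descriptors, and `n_extra` would be 3. Greedy selection evaluates each remaining candidate once per round rather than every possible subset, so it is cheap but does not guarantee the best combination.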

Keywords: QSAR; computational ADME; high-throughput technologies; in silico modeling; transporters.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biological Transport
  • Carrier Proteins / metabolism*
  • Computer Simulation
  • Lipids / chemistry
  • Molecular Weight
  • Multidrug Resistance-Associated Proteins / metabolism
  • Pharmaceutical Preparations / chemistry*
  • Pharmaceutical Preparations / metabolism*
  • Predictive Value of Tests
  • Substrate Specificity
  • Support Vector Machine*


Substances

  • Carrier Proteins
  • Lipids
  • Multidrug Resistance-Associated Proteins
  • Pharmaceutical Preparations