Using the Variable-Nearest Neighbor Method To Identify P-Glycoprotein Substrates and Inhibitors

ACS Omega. 2016 Nov 30;1(5):923-929. doi: 10.1021/acsomega.6b00247. Epub 2016 Nov 16.


Permeability glycoprotein (Pgp) is an essential membrane-bound transporter that efficiently extracts compounds from a cell. As such, it is a critical determinant of the pharmacokinetic properties of drugs. Multidrug resistance in cancer is often associated with overexpression of Pgp, which increases the efflux of chemotherapeutic agents from the cell. This, in turn, may prevent effective treatment by reducing the intracellular concentrations of such agents. Consequently, identifying compounds that can either be transported out of the cell by Pgp (substrates) or impair Pgp function (inhibitors) is of great interest. Herein, using publicly available data, we developed quantitative structure-activity relationship (QSAR) models of Pgp substrates and inhibitors. These models employed a variable-nearest neighbor (v-NN) method that calculated the structural similarity between molecules and based each prediction on all nearest neighbors that met a minimum similarity constraint; the models therefore possessed an applicability domain. The performance characteristics of these v-NN-based models were comparable or at times superior to those of other model constructs. The best v-NN models for identifying either Pgp substrates or inhibitors showed overall accuracies of >80% and κ values of >0.60 when tested on external data sets with candidate Pgp substrates and inhibitors. The v-NN prediction model with a well-defined applicability domain gave accurate and reliable results. The v-NN method is computationally efficient and requires no retraining of the prediction model when new assay information becomes available, an important feature for keeping QSAR models up to date and maintaining their performance at high levels.
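The idea of a v-NN prediction with an applicability domain can be illustrated with a minimal sketch. This is not the authors' implementation: the fingerprint representation (sets of on-bit indices), the use of Tanimoto similarity, the unweighted average of neighbor labels, and the names `tanimoto`, `vnn_predict`, and `min_similarity` are all assumptions made here for illustration. The sketch uses every training compound whose similarity to the query meets the threshold and returns no prediction when none qualifies.

```python
from typing import Optional, Sequence, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def vnn_predict(
    query: Set[int],
    training: Sequence[Tuple[Set[int], int]],
    min_similarity: float = 0.6,  # hypothetical threshold, not from the paper
) -> Optional[float]:
    """Average the labels (1 = active, 0 = inactive) of all training
    compounds at least `min_similarity` similar to the query.

    Returns None when no training compound qualifies, i.e. the query is
    treated as outside the applicability domain.
    """
    labels = [
        label
        for fingerprint, label in training
        if tanimoto(query, fingerprint) >= min_similarity
    ]
    if not labels:
        return None
    return sum(labels) / len(labels)


# Toy data (invented for illustration only).
training_set = [
    ({1, 2, 3, 4}, 1),
    ({1, 2, 3, 5}, 1),
    ({1, 2, 6, 7}, 0),
    ({8, 9}, 0),
]
print(vnn_predict({1, 2, 3, 4}, training_set, min_similarity=0.5))
print(vnn_predict({20, 21}, training_set, min_similarity=0.5))
```

Because the prediction is computed directly from the stored compounds, adding new assay results in this sketch only means appending entries to `training_set`; nothing is refitted.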