BMC Bioinformatics. 2014 Jun 20;15:210.
doi: 10.1186/1471-2105-15-210.

Exploiting Large-Scale Drug-Protein Interaction Information for Computational Drug Repurposing

Free PMC article

Ruifeng Liu et al. BMC Bioinformatics. 2014.

Abstract

Background: Despite increased investment in pharmaceutical research and development, fewer and fewer new drugs are entering the marketplace. This has prompted studies in repurposing existing drugs for use against diseases with unmet medical needs. A popular approach is to develop a classification model based on drugs with and without a desired therapeutic effect. For this approach to be statistically sound, it requires a large number of drugs in both classes. However, given few or no approved drugs for the diseases of highest medical urgency and interest, different strategies need to be investigated.

Results: We developed a computational method termed "drug-protein interaction-based repurposing" (DPIR) that is potentially applicable to diseases with very few approved drugs. The method, based on genome-wide drug-protein interaction information and Bayesian statistics, first identifies drug-protein interactions associated with a desired therapeutic effect. Then, it uses key drug-protein interactions to score other drugs for their potential to have the same therapeutic effect.
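The two steps can be sketched in a few lines of Python. The abstract does not give the scoring formula, so this sketch assumes a Laplacian-corrected naive Bayes weight, log((A_i + 1) / (B_i * M / N + 1)), which is one common choice for fingerprint-style features; it is not confirmed to be the authors' estimator. The symbols follow the Figure 1 caption (Ai, Bi, M, N), and the function names and feature labels are hypothetical.

```python
import math

def feature_weights(profiles, positives):
    """Step 1: weight each interaction feature by its association with the positive class.

    profiles:  dict mapping drug name -> set of on-bit feature ids
    positives: set of drug names known to have the desired effect
    Assumed estimator (not from the source): log((A_i + 1) / (B_i * M / N + 1)).
    """
    n_total = len(profiles)                    # N: all drugs
    n_pos = len(positives)                     # M: positive-class drugs
    base_rate = n_pos / n_total
    on_all, on_pos = {}, {}
    for drug, feats in profiles.items():
        for f in feats:
            on_all[f] = on_all.get(f, 0) + 1   # B_i: on-bits over all drugs
            if drug in positives:
                on_pos[f] = on_pos.get(f, 0) + 1   # A_i: on-bits in positive class
    return {f: math.log((on_pos.get(f, 0) + 1) / (b * base_rate + 1))
            for f, b in on_all.items()}

def rank_candidates(profiles, positives):
    """Step 2: score every non-positive drug by summing the weights of its on-bits."""
    weights = feature_weights(profiles, positives)
    scores = {d: sum(weights[f] for f in feats)
              for d, feats in profiles.items() if d not in positives}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With a single positive drug, features shared with that drug receive positive weights and features seen only in other drugs receive negative ones, so candidates that share its interactions rise to the top of the ranking.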

Conclusions: Detailed cross-validation studies using United States Food and Drug Administration-approved drugs for hypertension, human immunodeficiency virus, and malaria indicated that DPIR provides robust predictions. It achieves high levels of enrichment of drugs approved for a disease even with models developed based on a single drug known to treat the disease. Analysis of our model predictions also indicated that the method is potentially useful for understanding molecular mechanisms of drug action and for identifying protein targets that may potentiate the desired therapeutic effects of other drugs (combination therapies).

Figures

Figure 1
Schematic bit-string representation of a drug-human protein interaction profile. Each protein is represented by 3 bits to encode drug binding, drug activation, and drug inhibition of the protein, respectively. When a drug has been reported or predicted to bind, activate, or inhibit a protein, the bit representing the specific drug-protein interaction is turned on (assigned a value of 1). Otherwise, the bit is off (assigned a value of 0). M denotes the number of drugs with an approved indication (positive class), N denotes the total number of drugs, fi represents on-bit or off-bit of the i-th bit feature, Ai denotes the number of on-bits of the i-th bit feature in the positive class, and Bi denotes the number of on-bits of the i-th bit feature in all drugs.
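As a minimal sketch of the encoding this caption describes, the Python below builds a 3-bits-per-protein profile and counts on-bits per feature. The protein list, the interaction labels, and the function names are assumptions made for illustration; only the bind/activate/inhibit bit order comes from the caption.

```python
# Bit order within each protein's 3-bit block, as given in the caption.
INTERACTION_TYPES = ("bind", "activate", "inhibit")

def encode_profile(interactions, proteins):
    """Return a list of 0/1 bits, 3 per protein, for one drug.

    interactions: iterable of (protein, kind) pairs reported or predicted for the drug
    proteins:     ordered list of all proteins in the profile (hypothetical ordering)
    """
    index = {p: i for i, p in enumerate(proteins)}
    bits = [0] * (3 * len(proteins))
    for protein, kind in interactions:
        bits[3 * index[protein] + INTERACTION_TYPES.index(kind)] = 1
    return bits

def count_on_bits(bit_strings):
    """Per-feature on-bit counts over a collection of equal-length profiles."""
    return [sum(column) for column in zip(*bit_strings)]
```

Calling `count_on_bits` on the positive-class profiles gives the counts the caption calls Ai, and calling it on all profiles gives Bi.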
Figure 2
Three machine learning approaches for developing data-driven drug repurposing models. The total number of drugs and drug development candidates with drug-protein interaction profiles is 4,902. m denotes the number of drugs with a desirable therapeutic effect (positive class), n represents a subset of m used as the positive class of the training set for model development, and k denotes the number of drugs that do not have a desired therapeutic effect but can be used as false positives (FP) for the purpose of model development. TP: true positive.
Figure 3
Performance comparison between type I and type II models. Comparison of enrichment efficiencies of type I (A-C) and type II (D-F) models for high blood pressure (HBP), HIV, and antimalarial drugs. The models were built with one, two, and three drugs in the positive class of the training set. Bar heights denote the fraction of FDA-approved HBP, HIV, and antimalarial drugs in the testing set (type I models) or baseline class (type II models) that scored in the highest 1%, 5%, and 10% of the compounds, respectively. Error bars represent 1 standard deviation from full cross-validation calculations.
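The bar heights described here can be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the cutoff is rounded to a whole number of compounds with a minimum of one, and ties are left in input order; the source does not say how the authors handled either detail.

```python
def enrichment(scores, positives, fractions=(0.01, 0.05, 0.10)):
    """Fraction of known positive drugs that rank in the top x% by score.

    scores:    dict mapping drug name -> model score (higher is better)
    positives: set of drug names approved for the disease
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_pos = sum(1 for drug in ranked if drug in positives)
    if n_pos == 0:
        raise ValueError("no positive drugs among the scored compounds")
    result = {}
    for frac in fractions:
        # Assumed convention: round to a whole compound count, at least one.
        cutoff = max(1, round(frac * len(ranked)))
        hits = sum(1 for drug in ranked[:cutoff] if drug in positives)
        result[frac] = hits / n_pos
    return result
```

For a ranking that is no better than random, the expected value at each cutoff is the cutoff fraction itself (1%, 5%, or 10%), which is the natural baseline to compare against.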
Figure 4
Impact of false positives on the performance of type II models. The models were built with one to three true positives and either zero or one false positive. Error bars represent 1 standard deviation from full cross-validation calculations. The random bar heights represent the expected fractions of positive drugs in 1%, 5%, and 10% of randomly picked baseline compounds. Models constructed with no false positives correspond to type II models (Figure 3D-F). A: High blood pressure (HBP) model. B: Human immunodeficiency virus (HIV) model. C: Antimalarial model. TP: true positive. FP: false positive.


