BMC Bioinformatics. 2018 Apr 11;19(1):129.
doi: 10.1186/s12859-018-2123-4.

A Two-Tiered Unsupervised Clustering Approach for Drug Repositioning Through Heterogeneous Data Integration

Free PMC article

Pathima Nusrath Hameed et al. BMC Bioinformatics.


Background: Drug repositioning is the process of identifying new uses for existing drugs. Computational drug repositioning methods can reduce the time, costs and risks of drug development by automating the analysis of the relationships in pharmacology networks. Pharmacology networks are large and heterogeneous. Clustering drugs into small groups can simplify large pharmacology networks, and the resulting subgroups can also serve as a starting point for repositioning drugs. In this paper, we propose a two-tiered drug-centric unsupervised clustering approach for drug repositioning that integrates heterogeneous drug data profiles: drug-chemical, drug-disease, drug-gene, drug-protein and drug-side effect relationships.

Results: The proposed drug repositioning approach has three steps: (i) clustering drugs based on their homogeneous profiles using the Growing Self Organizing Map (GSOM); (ii) clustering drugs using drug-drug relation matrices derived from the previous step, considering three state-of-the-art graph clustering methods; and (iii) inferring drug repositioning candidates and assigning a confidence value to each identified candidate. We compare our two-tiered clustering approach against two existing heterogeneous data integration approaches with reference to the Anatomical Therapeutic Chemical (ATC) classification, using GSOM. Our approach yields Normalized Mutual Information (NMI) and Standardized Mutual Information (SMI) of 0.66 and 36.11, respectively, while the two existing methods yield NMI of 0.60 and 0.64 and SMI of 22.26 and 33.59. Moreover, the two existing approaches failed to produce useful cluster separations when using graph clustering algorithms, whereas our approach is able to identify useful clusters for drug repositioning. Furthermore, we provide clinical evidence for four predicted results (Chlorthalidone, Indomethacin, Metformin and Thioridazine) to support the claim that the proposed approach can be reliably used to infer ATC codes and drug repositioning candidates.
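
To make the NMI comparison concrete, the following is a minimal sketch of how NMI between a set of cluster assignments and a reference classification (such as ATC groups) could be computed. The abstract does not state which normalization the authors used; the arithmetic mean of the two entropies is an assumption here, and the function names and example labels are hypothetical. SMI is not covered by this sketch.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalized by the arithmetic mean of the entropies (an assumption)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one labeling puts everything in one group.
        return 1.0 if h_a == h_b else 0.0
    return mutual_information(a, b) / ((h_a + h_b) / 2)


# Hypothetical example: four drugs, predicted clusters vs. reference groups.
predicted = [0, 0, 1, 1]
reference = ["A", "A", "B", "B"]
score = nmi(predicted, reference)
```

In this toy example the clustering matches the reference grouping exactly (up to renaming of the groups), so the score is at its maximum; a clustering that is statistically independent of the reference would score zero.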

Conclusion: The proposed two-tiered unsupervised clustering approach is suitable for drug clustering and enables heterogeneous data integration. It also enables the identification of reliable drug repositioning candidates with reference to the ATC therapeutic classification. Candidates that are identified consistently by multiple clustering algorithms and with high confidence are more likely to be effective repositioning candidates.

Keywords: ATC classification; Data integration; Drug clustering; Drug repurposing; Heterogeneity.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Fig. 1
A generalized illustration of two alternative approaches involved in drug repositioning; (a), (b) and (c) represent the known interactions, New Target Recognition and New Indication Recognition, respectively. (The notations 1*-1* and m-n indicate one-or-many and many-to-many relationships, respectively)
Fig. 2
The proposed approach
Fig. 3
Drug-feature associations can be captured in a bipartite graph, as shown in (a); the corresponding adjacency matrix is shown in (b). D(1,2,3) denotes the drugs, while F(1,2,3,4) denotes the features, such as chemical, disease, protein and side effect
Fig. 4
(a) illustrates drug clusters, while (b) illustrates the corresponding drug-drug associations. D(1,2,3) and C(1,2) denote the drugs and the clusters, respectively
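
The two data structures described in the captions of Fig. 3 and Fig. 4 can be sketched as follows. This is only an illustration under stated assumptions: the figures themselves are not reproduced here, so the edges and cluster memberships below are hypothetical, as are the function names. It is also assumed that two drugs are associated when they share at least one cluster and that a drug is not associated with itself; the paper may define the drug-drug relation differently.

```python
def bipartite_adjacency(pairs, drugs, features):
    """Binary drug-by-feature adjacency matrix from (drug, feature) pairs."""
    row = {d: i for i, d in enumerate(drugs)}
    col = {f: j for j, f in enumerate(features)}
    matrix = [[0] * len(features) for _ in drugs]
    for drug, feature in pairs:
        matrix[row[drug]][col[feature]] = 1
    return matrix


def co_cluster_matrix(clusters, drugs):
    """Binary drug-by-drug matrix: 1 if two distinct drugs share a cluster.

    Assumption: co-membership defines the association; the diagonal stays 0.
    """
    row = {d: i for i, d in enumerate(drugs)}
    matrix = [[0] * len(drugs) for _ in drugs]
    for members in clusters.values():
        for a in members:
            for b in members:
                if a != b:
                    matrix[row[a]][row[b]] = 1
    return matrix


drugs = ["D1", "D2", "D3"]
features = ["F1", "F2", "F3", "F4"]

# Hypothetical drug-feature edges (not taken from the figure).
pairs = [("D1", "F1"), ("D1", "F2"), ("D2", "F2"), ("D2", "F3"), ("D3", "F4")]
adjacency = bipartite_adjacency(pairs, drugs, features)

# Hypothetical cluster memberships (not taken from the figure).
clusters = {"C1": ["D1", "D2"], "C2": ["D3"]}
drug_drug = co_cluster_matrix(clusters, drugs)
```

Each row of `adjacency` is the kind of homogeneous drug profile that a first-tier clustering could take as input, and `drug_drug` is the kind of relation matrix that a graph clustering method could take as input in a second tier.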



