Learning Drug-Disease-Target Embedding (DDTE) from knowledge graphs to inform drug repurposing hypotheses

Changsung Moon; Chunming Jin; Xialan Dong; Saad Abrar; Weifan Zheng; Rada Y Chirkova; Alexander Tropsha

doi:10.1016/j.jbi.2021.103838

J Biomed Inform. 2021 Jul:119:103838. doi: 10.1016/j.jbi.2021.103838. Epub 2021 Jun 11.

Authors

Changsung Moon¹, Chunming Jin², Xialan Dong², Saad Abrar¹, Weifan Zheng³, Rada Y Chirkova⁴, Alexander Tropsha⁵

Affiliations

¹ Department of Computer Science, North Carolina State University, Raleigh, NC 27695, USA.
² BRITE Institute and Department of Pharmaceutical Sciences, College of Health and Sciences, North Carolina Central University, Durham, NC 27707, USA.
³ BRITE Institute and Department of Pharmaceutical Sciences, College of Health and Sciences, North Carolina Central University, Durham, NC 27707, USA; UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, NC 27599, USA. Electronic address: wzheng@nccu.edu.
⁴ Department of Computer Science, North Carolina State University, Raleigh, NC 27695, USA. Electronic address: rychirko@ncsu.edu.
⁵ UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, NC 27599, USA. Electronic address: alex_tropsha@unc.edu.

PMID: 34119691
DOI: 10.1016/j.jbi.2021.103838

Abstract

We aimed to develop and validate a new graph embedding algorithm for drug-disease-target networks to generate novel drug repurposing hypotheses. Our model denotes drugs, diseases and targets as subjects, predicates and objects, respectively. Each entity is represented by a multidimensional vector, and the predicate is regarded as a translation vector from the subject vector to the object vector. These vectors are optimized so that when a subject-predicate-object triple represents a known drug-disease-target relationship, the sum of the subject and predicate vectors is close to the object vector; otherwise, the sum is distant from the object vector. The DTINet dataset was used to test this algorithm and to discover unknown links between drugs and diseases. In cross-validation experiments, the new algorithm outperformed the original DTINet model: the MRR (Mean Reciprocal Rank) values of our models were around 0.80, while those of the original model were about 0.70. In addition, we identified and verified several pairs of new therapeutic relations, as well as adverse effect relations, that were not recorded in the original DTINet dataset. The predicted drug-disease and drug-side-effect relationships were found to be consistent with literature reports. This method can be used to analyze diverse types of emerging biomedical and healthcare-related knowledge graphs (KG).
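
The translation idea and the MRR metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The Euclidean distance, the margin-based ranking loss (in the style of TransE-like models), the toy two-dimensional vectors, and all function and entity names below are assumptions made for illustration only; the abstract does not specify the distance measure or the loss function.

```python
import math


def translation_distance(subject, predicate, obj):
    """Euclidean distance between (subject + predicate) and obj.

    Assumption: Euclidean distance; the abstract only says "close" or "distant".
    """
    return math.sqrt(
        sum((s + p - o) ** 2 for s, p, o in zip(subject, predicate, obj))
    )


def margin_loss(pos_dist, neg_dist, margin=1.0):
    """Hinge loss that is zero once a known triple is closer than a
    corrupted triple by at least `margin` (assumed, TransE-style)."""
    return max(0.0, margin + pos_dist - neg_dist)


def rank_of_true_object(subject, predicate, candidates, true_name):
    """1-based rank of the true object among candidates, nearest first."""
    ordered = sorted(
        candidates,
        key=lambda name: translation_distance(
            subject, predicate, candidates[name]
        ),
    )
    return ordered.index(true_name) + 1


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical toy embeddings: drug = subject, disease = predicate,
# targets = candidate objects.
drug = [1.0, 0.0]
disease = [0.0, 1.0]
targets = {
    "target_a": [1.0, 1.0],   # known triple: drug + disease lands here
    "target_b": [3.0, -2.0],  # corrupted (unknown) triple
}

pos = translation_distance(drug, disease, targets["target_a"])
neg = translation_distance(drug, disease, targets["target_b"])
loss = margin_loss(pos, neg)
rank = rank_of_true_object(drug, disease, targets, "target_a")
mrr = mean_reciprocal_rank([1, 2, 4])
```

In this toy case the known triple has distance zero and ranks first, so the loss vanishes. An MRR near 0.80, as reported above, means the true entity typically appears at or very near the top of the ranked candidate list.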

Keywords: Data mining; Drug repurposing; Graph embedding; Knowledge graph.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Drug Repositioning*
Humans
Knowledge
Pattern Recognition, Automated
Pharmaceutical Preparations*

Substances

Pharmaceutical Preparations

Grants and funding

U01 CA207160/CA/NCI NIH HHS/United States