Predicting Treatment Relations with Semantic Patterns over Biomedical Knowledge Graphs

Min Intell Knowl Explor (2015). 2015 Dec;9468:586-596. doi: 10.1007/978-3-319-26832-3_55. Epub 2016 Jan 3.

Abstract

Identifying new potential treatment options (say, medications and procedures) for known medical conditions that cause human disease burden is a central task of biomedical research. Since all candidate drugs cannot be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Even before this step, due to recent advances, in silico or computational approaches are also being employed to identify viable treatment options. Generally, natural language processing (NLP) and machine learning are used to predict specific relations between any given pair of entities using the distant supervision approach. In this paper, we report preliminary results on predicting treatment relations between biomedical entities purely based on semantic patterns over biomedical knowledge graphs. As such, we refrain from explicitly using NLP, although the knowledge graphs themselves may be built from NLP extractions. Our intuition is fairly straightforward - entities that participate in a treatment relation may be connected using similar path patterns in biomedical knowledge graphs extracted from scientific literature. Using a dataset of treatment relation instances derived from the well known Unified Medical Language System (UMLS), we verify our intuition by employing graph path patterns from a well known knowledge graph as features in machine learned models. We achieve a high recall (92 %) but precision, however, decreases from 95% to an acceptable 71% as we go from uniform class distribution to a ten fold increase in negative instances. We also demonstrate models trained with patterns of length ≤ 3 result in statistically significant gains in F-score over those trained with patterns of length ≤ 2. Our results show the potential of exploiting knowledge graphs for relation extraction and we believe this is the first effort to employ graph patterns as features for identifying biomedical relations.