Objectives: This paper presents a methodology for recovering and decomposing Swanson's Raynaud Syndrome-Fish Oil hypothesis semi-automatically. The methodology leverages the semantics of assertions extracted from biomedical literature (called semantic predications) along with structured background knowledge and graph-based algorithms to semi-automatically capture the informative associations originally discovered manually by Swanson. Demonstrating that Swanson's manually intensive techniques can be undertaken semi-automatically, paves the way for fully automatic semantics-based hypothesis generation from scientific literature.
Methods: Semantic predications obtained from biomedical literature allow the construction of labeled directed graphs which contain various associations among concepts from the literature. By aggregating such associations into informative subgraphs, some of the relevant details originally articulated by Swanson have been uncovered. However, by leveraging background knowledge to bridge important knowledge gaps in the literature, a methodology for semi-automatically capturing the detailed associations originally explicated in natural language by Swanson, has been developed.
Results: Our methodology not only recovered the three associations commonly recognized as Swanson's hypothesis, but also decomposed them into an additional 16 detailed associations, formulated as chains of semantic predications. Altogether, 14 out of the 19 associations that can be attributed to Swanson were retrieved using our approach. To the best of our knowledge, such an in-depth recovery and decomposition of Swanson's hypothesis has never been attempted.
Conclusion: In this work therefore, we presented a methodology to semi-automatically recover and decompose Swanson's RS-DFO hypothesis using semantic representations and graph algorithms. Our methodology provides new insights into potential prerequisites for semantics-driven Literature-Based Discovery (LBD). Based on our observations, three critical aspects of LBD include: (1) the need for more expressive representations beyond Swanson's ABC model; (2) an ability to accurately extract semantic information from text; and (3) the semantic integration of scientific literature and structured background knowledge.
Published by Elsevier Inc.