Representing and querying disease networks using graph databases

BioData Min. 2016 Jul 25;9:23. doi: 10.1186/s13040-016-0102-8. eCollection 2016.

Abstract

Background: Systems biology experiments generate large volumes of data of multiple modalities, and integrating this information is challenging because the data are both complex and semantically rich. Here, we describe how graph databases provide a powerful framework for the storage, querying and visualization of biological data.

Results: We show how graph databases are well suited to representing biological information, which is typically highly connected, semi-structured and unpredictable. We outline an application case that uses the Neo4j graph database to build and query a prototype network that provides biological context for asthma-related genes.
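
As a rough illustration of the idea of a graph giving biological context to a gene, the following sketch models a tiny labelled property graph in plain Python and walks outward from one gene to its encoded protein, that protein's interaction partners and its pathways. It is not the authors' Neo4j implementation: the `PropertyGraph` class, the node labels, the relationship types (`ENCODES`, `INTERACTS_WITH`, `PARTICIPATES_IN`) and the placeholder identifiers (`GENE_A`, `PROT_A`, `PATHWAY_X`) are all assumptions made for the example and do not come from the study.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny in-memory labelled property graph (illustrative only, not Neo4j)."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"label": ..., plus properties}
        self.edges = defaultdict(list)  # node id -> [(relationship type, other id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, rel_type, target):
        # Store both directions so traversal can ignore edge direction.
        self.edges[source].append((rel_type, target))
        self.edges[target].append((rel_type, source))

    def neighbours(self, node_id, rel_type=None, label=None):
        """Return sorted ids of adjacent nodes, optionally filtered."""
        found = set()
        for rel, other in self.edges.get(node_id, []):
            if rel_type is not None and rel != rel_type:
                continue
            if label is not None and self.nodes[other]["label"] != label:
                continue
            found.add(other)
        return sorted(found)


def context_for_gene(graph, gene_id):
    """Collect proteins encoded by a gene with their interactors and pathways."""
    context = {}
    for protein in graph.neighbours(gene_id, "ENCODES", "Protein"):
        context[protein] = {
            "interactors": graph.neighbours(protein, "INTERACTS_WITH", "Protein"),
            "pathways": graph.neighbours(protein, "PARTICIPATES_IN", "Pathway"),
        }
    return context


# Hypothetical placeholder data; none of these identifiers come from the paper.
g = PropertyGraph()
g.add_node("GENE_A", "Gene", disease="asthma")
g.add_node("GENE_B", "Gene")
g.add_node("PROT_A", "Protein")
g.add_node("PROT_B", "Protein")
g.add_node("PATHWAY_X", "Pathway")
g.add_edge("GENE_A", "ENCODES", "PROT_A")
g.add_edge("GENE_B", "ENCODES", "PROT_B")
g.add_edge("PROT_A", "INTERACTS_WITH", "PROT_B")
g.add_edge("PROT_A", "PARTICIPATES_IN", "PATHWAY_X")

asthma_context = context_for_gene(g, "GENE_A")
print(asthma_context)
```

Because nodes carry free-form properties and new relationship types can be added without changing a schema, the same structure can absorb further data types (for example, drugs or tissues) as they become available, which is the flexibility the abstract attributes to graph databases.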

Conclusions: Our study suggests that graph databases provide a flexible solution for the integration of multiple types of biological data and facilitate exploratory data mining to support hypothesis generation.

Keywords: Computational approach; Disease management platform; Graph database; Neo4j graph; Protein-centric framework; Systems medicine.

Publication types

  • Review