BioDWH2: an automated graph-based data warehouse and mapping tool

J Integr Bioinform. 2021 Feb 22;18(2):167-176. doi: 10.1515/jib-2020-0033.

Abstract

Data integration plays a vital role in scientific research. In biomedical research, OMICS fields such as proteomics and pharmacogenomics, along with newer fields such as foodomics, have shown the need for larger datasets. As research projects require multiple data sources, mapping between these sources becomes necessary. The workflow systems and integration tools in use therefore need to process large amounts of data in heterogeneous formats, check for data source updates, and find suitable mapping methods to cross-reference entities from different databases. This article presents BioDWH2, an open-source, graph-based data warehouse and mapping tool capable of helping researchers with these issues. A workspace-centered approach allows project-specific data source selections, and Neo4j or GraphQL server tools enable quick access to the database for analysis. The BioDWH2 tools are available to the scientific community at https://github.com/BioDWH2.

Keywords: data warehousing; database; graph database; pipeline; software tools.

MeSH terms

  • Data Warehousing*
  • Databases, Factual
  • Information Storage and Retrieval
  • Software*
  • Workflow