Introduction: Sample collections and their data are hosted in biobanks at diverse institutions across Europe. Our data integration framework aims to incorporate data about sample collections from different biobanks into a common research infrastructure, making it easier for researchers to obtain high-quality samples for their research. The resulting information must be gathered locally and distributed to searchable higher-level biobank directories to maximize visibility at the national and European levels. Therefore, biobanks and sample collections must be clearly described and unambiguously identified. We describe how to tackle the challenges of integrating biobank-related data between biobank directories that use heterogeneous data schemas and different technical environments.
Methods: To establish a data exchange infrastructure between all biobank directories involved, we propose the following steps: (A) identification of core entities, terminology, and semantic relationships, (B) harmonization of heterogeneous data schemas of different Biobanking and Biomolecular Resources Research Infrastructure (BBMRI) directories, and (C) formulation of technical core principles for biobank data exchange between directories.
Results: (A) We identified the core elements needed to describe biobanks in biobank directories. Since all directory data models were partially based on Minimum Information About BIobank Data Sharing (MIABIS) 2.0, the MIABIS 2.0 core model was used for compatibility. (B) Different projection scenarios were elaborated in collaboration with all BBMRI.at partners. A minimum set of mandatory and optional core entities and data items was defined for mapping across all directory levels. (C) Core data exchange principles were formulated, and all biobank directories involved implemented data interfaces.
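The projection in step (B) can be illustrated with a minimal sketch, assuming a flat record per biobank. The attribute names, the field mapping, and the function `project_to_core` are hypothetical and chosen only for illustration; they are not the attribute set or the interfaces agreed on by the directories.

```python
from typing import Any, Dict, Mapping, Sequence

# Hypothetical core attribute names; illustrative only, not the agreed core set.
MANDATORY: Sequence[str] = ("id", "name", "country")
OPTIONAL: Sequence[str] = ("acronym", "url", "description")

# Hypothetical mapping from one local directory's field names to core names.
LOCAL_TO_CORE: Mapping[str, str] = {
    "biobank_id": "id",
    "title": "name",
    "country_code": "country",
    "short_name": "acronym",
    "homepage": "url",
}


def project_to_core(
    record: Mapping[str, Any],
    field_map: Mapping[str, str] = LOCAL_TO_CORE,
    mandatory: Sequence[str] = MANDATORY,
    optional: Sequence[str] = OPTIONAL,
) -> Dict[str, Any]:
    """Project a local directory record onto the shared core attributes."""
    allowed = set(mandatory) | set(optional)
    core: Dict[str, Any] = {}
    for local_name, value in record.items():
        core_name = field_map.get(local_name)
        # Skip unmapped local fields, non-core targets, and empty values.
        if core_name is None or core_name not in allowed:
            continue
        if value is None or value == "":
            continue
        core[core_name] = value

    # Every mandatory attribute must be present after the projection.
    missing = [name for name in mandatory if name not in core]
    if missing:
        raise ValueError(f"missing mandatory attributes: {missing}")
    return core


local_record = {
    "biobank_id": "bb-001",
    "title": "Example Biobank",
    "country_code": "AT",
    "homepage": "",           # empty optional value is dropped
    "internal_note": "n/a",   # unmapped local field is dropped
}
core_record = project_to_core(local_record)
```

In this sketch a record lacking a mandatory attribute is rejected rather than passed upward, so a higher-level directory would only receive records that satisfy the minimum set; optional attributes are forwarded when available.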
Discussion: We agreed on a MIABIS 2.0-based core set of harmonized biobank attributes and established a list of core data exchange principles for integrating biobank directories at different levels. This generic approach and the principles proposed herein can also be applied to related tasks, such as the integration and harmonization of biobank data at the individual sample and patient levels.
Keywords: BBMRI; EDI interface; MIABIS; RESTful; biobank directory; data integration.