From biobank and data silos into a data commons: convergence to support translational medicine

J Transl Med. 2021 Dec 4;19(1):493. doi: 10.1186/s12967-021-03147-z.

Abstract

Background: To drive translational medicine, modern-day biobanks need to integrate with other sources of data (clinical, genomic) to support novel data-intensive research. Currently, vast amounts of research and clinical data remain in silos, held and managed by individual researchers operating under different standards and governance structures, a framework that impedes the sharing and effective use of data. In this article, we describe the journey of British Columbia's Gynecological Cancer Research Program (OVCARE) in moving a traditional tumour biobank, an outcomes unit, and a collection of data silos into an integrated data commons that supports data standardization and resource sharing under collaborative governance. The aim is to give the gynecologic cancer research community in British Columbia access to tissue samples and associated clinical and molecular data from thousands of patients.

Results: Through several engagements with stakeholders from various institutions within our research community, we identified priorities and assessed the infrastructure needed to optimize and support data collection, storage, and sharing under three main research domains: (1) biospecimen collections, (2) molecular and genomic data, and (3) clinical data. We further built a governance model and a resource portal to implement protocols and standard operating procedures for the seamless collection, management, and governance of interoperable data, making genomic and clinical data available to the broader research community.

Conclusions: Proper infrastructure for data collection, sharing, and governance is a translational research imperative. We have consolidated our data holdings into a data commons, along with standard operating procedures, to meet the research and ethics requirements of the gynecologic cancer community in British Columbia. The developed infrastructure brings together diverse data and computing frameworks, as well as tools and applications for managing, analyzing, and sharing data. Our data commons bridges data access gaps and barriers to precision medicine and to approaches for the diagnosis, treatment, and prevention of gynecological cancers by providing access to the large datasets required for data-intensive science.

Keywords: Biobank-technologies; Biobanks; Biospecimens; Data commons; Data governance; Federated systems; Laboratory Information Management Systems (LIMS); Precision medicine.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Specimen Banks*
  • Female
  • Genome
  • Genomics
  • Humans
  • Translational Research, Biomedical
  • Translational Science, Biomedical*