Evaluation of repositories for sharing individual-participant data from clinical studies

Trials. 2019 Mar 15;20(1):169. doi: 10.1186/s13063-019-3253-3.


Background: Data repositories have the potential to play an important role in the effective and safe sharing of individual-participant data (IPD) from clinical studies. We analysed the current landscape of data repositories to create a detailed description of available repositories and assess their suitability for hosting data from clinical studies, from the perspective of the clinical researcher.

Methods: We assessed repositories that enable storage, sharing, discoverability, and re-use of IPD and associated documents from clinical studies, using a pre-defined set of 34 items and information that was publicly available from April to June 2018. For this purpose, we developed an indicator set to capture the maturity of the repositories' procedures and their suitability for hosting IPD. The indicators cover guidelines for data upload and data de-identification, data quality controls, contracts for upload and storage, flexibility of access, application of identifiers, availability of metadata, and long-term preservation.

Results: We analysed 25 repositories, from an initial set of 55 identified as possibly relevant. Half of the included repositories were generic, i.e. not limited to a specific disease or clinical area, and 13 were launched in the last 8 years. The sample was extremely heterogeneous and included repositories developed by research funders, infrastructures, universities, and editors. All but three repositories charge no fee for uploading, storing or accessing data. None of the repositories completely demonstrated all the items included in the indicator set, but three repositories (Dryad, Drum, EASY) met all items, either fully or partially. Flexibility of data-access modalities appears to be limited: it is lacking in half of the repositories.

Conclusions: Our evaluation, though often hampered by the lack of sufficient information, can help researchers to find a suitable repository for their datasets. Some repositories are more mature because of their support for clinical dataset preparation, contractual agreements, metadata and identifiers, different modalities of access, and long-term preservation of data. Further work is now required to achieve a more robust and accurate system for evaluation, which in turn may encourage the sharing of clinical study data.

Trial registration: Study protocol available at https://zenodo.org/record/1438261#.W64kW9Egrcs .

Keywords: Clinical studies; Data repositories; Data sharing; Individual-participant data; Metadata.

Publication types

  • Review

MeSH terms

  • Access to Information*
  • Big Data*
  • Clinical Studies as Topic*
  • Data Collection / methods*
  • Data Mining / methods*
  • Databases, Factual*
  • Humans
  • Information Dissemination / methods*
  • Metadata