A reference set of curated biomedical data and metadata from clinical case reports

Sci Data. 2018 Nov 20;5:180258. doi: 10.1038/sdata.2018.258.


Clinical case reports (CCRs) provide an important means of sharing clinical experiences about atypical disease phenotypes and new therapies. However, published case reports contain largely unstructured and heterogeneous clinical data, posing a challenge to mining relevant information. Current indexing approaches generally concern document-level features and have not been specifically designed for CCRs. To address this gap, we developed a standardized metadata template and identified text corresponding to medical concepts within 3,100 curated CCRs spanning 15 disease groups and more than 750 reports of rare diseases. We also prepared a subset of metadata for reports on selected mitochondrial diseases and assigned ICD-10 diagnostic codes to each. The resulting resource, Metadata Acquired from Clinical Case Reports (MACCRs), contains text associated with high-level clinical concepts, including demographics, disease presentation, treatments, and outcomes for each report. Our template and MACCR set render CCRs more findable, accessible, interoperable, and reusable (FAIR) while serving as valuable resources for key user groups, including researchers, physician investigators, clinicians, data scientists, and those shaping government policies for clinical trials.
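
As a rough illustration of what a per-report metadata record built around the high-level concepts named above (demographics, disease presentation, treatments, outcomes, and optional ICD-10 codes) might look like, here is a minimal Python sketch. The class name, field names, and example values are all hypothetical placeholders and are not taken from the published template or dataset.

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical names for the four high-level concepts listed in the abstract.
CONCEPT_FIELDS = ("demographics", "disease_presentation", "treatments", "outcomes")


@dataclass
class CaseReportMetadata:
    """Illustrative record for one case report (not the published template)."""
    report_id: str                      # hypothetical identifier
    disease_group: str                  # one of the curated disease groups
    demographics: str = ""              # text identified for each concept
    disease_presentation: str = ""
    treatments: str = ""
    outcomes: str = ""
    icd10_codes: List[str] = field(default_factory=list)

    def missing_concepts(self) -> List[str]:
        """Return the concept fields that have no text recorded."""
        return [name for name in CONCEPT_FIELDS if not getattr(self, name).strip()]


# Placeholder values only; "X00.0" is not a real diagnostic code.
example = CaseReportMetadata(
    report_id="example-0001",
    disease_group="mitochondrial disease",
    demographics="placeholder text about patient age and sex",
    disease_presentation="placeholder text about signs and symptoms",
    treatments="placeholder text about therapy",
    icd10_codes=["X00.0"],
)

print(example.missing_concepts())
print(asdict(example)["disease_group"])
```

A flat record of this kind converts directly to a dictionary for export, and a completeness check such as `missing_concepts` is one simple way to flag reports where a concept has not yet been annotated.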

Publication types

  • Dataset
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Clinical Studies as Topic*
  • Computational Biology
  • Data Analysis
  • Data Curation* / methods
  • Data Curation* / standards
  • Humans
  • Metadata* / standards

Associated data

  • Dryad/10.5061/dryad.r36cn90
  • figshare/10.6084/m9.figshare.c.4220324