Large scale crowdsourced radiotherapy segmentations across a variety of cancer anatomic sites

Sci Data. 2023 Mar 22;10(1):161. doi: 10.1038/s41597-023-02062-w.


Clinician-generated segmentation of tumor and healthy tissue regions of interest (ROIs) on medical images is crucial for radiotherapy. However, interobserver segmentation variability has long been considered a significant detriment to the delivery of high-quality and consistent radiotherapy doses, which has prompted the increasing development of automated segmentation approaches. Existing segmentation datasets, though, typically provide segmentations generated by only a limited number of annotators with varying, and often unspecified, levels of expertise. In this data descriptor, numerous clinician annotators manually generated segmentations for ROIs on computed tomography images across a variety of cancer sites (breast, sarcoma, head and neck, gynecologic, gastrointestinal; one patient per cancer site) for the Contouring Collaborative for Consensus in Radiation Oncology challenge. In total, over 200 annotators (experts and non-experts) contributed using a standardized annotation platform (ProKnow). Subsequently, we converted Digital Imaging and Communications in Medicine data into Neuroimaging Informatics Technology Initiative format with standardized nomenclature for ease of use. In addition, we generated consensus segmentations for experts and non-experts using the Simultaneous Truth and Performance Level Estimation method. These standardized, structured, and easily accessible data are a valuable resource for systematically studying variability in segmentation applications.
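As a rough illustration of how interobserver variability and a consensus segmentation might be quantified, the sketch below computes pairwise Dice similarity between annotator masks and a voxel-wise majority vote. This is not the Simultaneous Truth and Performance Level Estimation method used for the dataset; majority voting is a simpler stand-in chosen for brevity. The function names and the toy masks are hypothetical and are not taken from the dataset or its accompanying code.

```python
from itertools import combinations
from statistics import mean


def dice(a, b):
    """Dice similarity coefficient between two flattened binary masks."""
    intersection = sum(x and y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Two empty masks are treated as in perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def majority_vote(masks):
    """Voxel-wise majority vote (a simple stand-in for STAPLE, not STAPLE itself)."""
    n = len(masks)
    return [1 if 2 * sum(votes) > n else 0 for votes in zip(*masks)]


def mean_pairwise_dice(masks):
    """Average Dice over all annotator pairs, a crude interobserver agreement score."""
    return mean(dice(a, b) for a, b in combinations(masks, 2))


# Toy data: three hypothetical annotators, one ROI flattened to 8 voxels.
annotators = [
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
]

consensus = majority_vote(annotators)
agreement = mean_pairwise_dice(annotators)
print("consensus:", consensus)
print("mean pairwise Dice:", round(agreement, 3))
```

Unlike this majority vote, which weights every annotator equally, the Simultaneous Truth and Performance Level Estimation method estimates per-annotator performance and weights their contributions accordingly. In practice the masks would be loaded from the converted image files as 3D arrays rather than written as flat lists.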

Publication types

  • Dataset

MeSH terms

  • Crowdsourcing*
  • Female
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Neoplasms* / diagnostic imaging
  • Neoplasms* / radiotherapy
  • Radiation Oncology*
  • Radiotherapy Planning, Computer-Assisted / methods
  • Tomography, X-Ray Computed