A patient-centric dataset of images and metadata for identifying melanomas using clinical context

Sci Data. 2021 Jan 28;8(1):34. doi: 10.1038/s41597-021-00815-z.


Prior skin image datasets have not addressed patient-level information obtained from multiple skin lesions from the same patient. Although artificial intelligence classification algorithms have achieved expert-level performance in controlled studies examining single images, in practice dermatologists base their judgment holistically on multiple lesions from the same patient. The 2020 SIIM-ISIC Melanoma Classification challenge dataset described herein was constructed to address this discrepancy between prior challenges and clinical practice. For each image, the dataset provides an identifier that allows lesions from the same patient to be mapped to one another. This patient-level contextual information is frequently used by clinicians to diagnose melanoma and is especially useful in ruling out false positives in patients with many atypical nevi. The dataset represents 2,056 patients (20.8% with at least one melanoma, 79.2% with zero melanomas) from three continents, with an average of 16 lesions per patient. It consists of 33,126 dermoscopic images, of which 584 (1.8%) are histopathologically confirmed melanomas and the remainder are benign melanoma mimickers.
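
To illustrate how a per-image patient identifier lets lesions be grouped by patient, the following is a minimal sketch in Python. The field names (`image_id`, `patient_id`, `is_melanoma`) and the toy records are hypothetical and are not taken from the dataset; only the final arithmetic check uses figures quoted in the abstract above.

```python
from collections import defaultdict

# Toy records, invented purely for illustration (NOT real dataset rows).
# The field names are hypothetical; the abstract only states that each
# image carries an identifier linking it to a patient.
records = [
    {"image_id": "img_001", "patient_id": "P1", "is_melanoma": 0},
    {"image_id": "img_002", "patient_id": "P1", "is_melanoma": 0},
    {"image_id": "img_003", "patient_id": "P1", "is_melanoma": 1},
    {"image_id": "img_004", "patient_id": "P2", "is_melanoma": 0},
    {"image_id": "img_005", "patient_id": "P2", "is_melanoma": 0},
]


def group_by_patient(rows):
    """Map each patient identifier to the list of that patient's lesion images."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["patient_id"]].append(row)
    return dict(groups)


def summarize(rows):
    """Compute patient-level counts of the kind reported in the abstract."""
    groups = group_by_patient(rows)
    patients_with_melanoma = sum(
        1
        for lesions in groups.values()
        if any(r["is_melanoma"] for r in lesions)
    )
    return {
        "n_patients": len(groups),
        "n_images": len(rows),
        "n_melanomas": sum(r["is_melanoma"] for r in rows),
        "patients_with_melanoma": patients_with_melanoma,
        "images_per_patient": len(rows) / len(groups),
    }


summary = summarize(records)

# Consistency check on the figures quoted in the abstract:
# 584 melanomas among 33,126 images, spread over 2,056 patients.
melanoma_pct = round(584 / 33126 * 100, 1)
mean_lesions = round(33126 / 2056)
```

With the figures from the abstract, `melanoma_pct` and `mean_lesions` agree with the reported 1.8% and 16 lesions per patient. Grouping by patient in this way is one possible starting point for comparing a lesion against the same patient's other lesions; it is not a description of any challenge entry's method.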

Publication types

  • Dataset
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • Humans
  • Melanoma* / diagnostic imaging
  • Melanoma* / pathology
  • Melanoma* / physiopathology
  • Metadata
  • Skin / pathology
  • Skin Neoplasms* / diagnostic imaging
  • Skin Neoplasms* / pathology
  • Skin Neoplasms* / physiopathology