A large, curated, open-source stroke neuroimaging dataset to improve lesion segmentation algorithms

Sci Data. 2022 Jun 16;9(1):320. doi: 10.1038/s41597-022-01401-7.


Accurate lesion segmentation is critical in stroke rehabilitation research for the quantification of lesion burden and accurate image processing. Current automated lesion segmentation methods for T1-weighted (T1w) MRIs, commonly used in stroke research, lack accuracy and reliability. Manual segmentation remains the gold standard, but it is time-consuming, subjective, and requires neuroanatomical expertise. We previously released an open-source dataset of stroke T1w MRIs and manually segmented lesion masks (ATLAS v1.2, N = 304) to encourage the development of better algorithms. However, many methods developed with ATLAS v1.2 report low accuracy, are not publicly accessible, or are improperly validated, limiting their utility to the field. Here we present ATLAS v2.0 (N = 1271), a larger dataset of T1w MRIs and manually segmented lesion masks that includes training (n = 655), test (hidden masks, n = 300), and generalizability (hidden MRIs and masks, n = 316) datasets. Algorithm development using this larger sample should lead to more robust solutions; the hidden datasets allow for unbiased performance evaluation via segmentation challenges. We anticipate that ATLAS v2.0 will lead to improved algorithms, facilitating large-scale stroke research.
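As an illustration of what comparing an automated segmentation against a manually segmented mask can look like, the following is a minimal sketch of the Dice overlap coefficient, a widely used segmentation metric. The abstract does not say which metrics are used for evaluation, so the choice of Dice is an assumption here. The function name `dice_coefficient` and the toy masks are hypothetical and are not ATLAS data.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice overlap between two binary masks given as flat 0/1 sequences.

    Assumption: masks are flattened voxel arrays of equal length, where a
    nonzero value marks a lesion voxel.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    pred_count = sum(1 for p in pred if p)
    truth_count = sum(1 for t in truth if t)
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    if pred_count + truth_count == 0:
        # Both masks empty: treated as perfect agreement (a convention,
        # not something specified by the source).
        return 1.0
    return 2.0 * overlap / (pred_count + truth_count)


# Toy 8-voxel example with made-up values.
manual_mask = [0, 1, 1, 1, 0, 0, 0, 0]
predicted_mask = [0, 0, 1, 1, 1, 0, 0, 0]
score = dice_coefficient(predicted_mask, manual_mask)
print(f"Dice = {score:.3f}")
```

A score of 1 means the two masks coincide exactly and 0 means they share no lesion voxels. In practice the masks would be loaded from image files rather than typed in as lists.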

Publication types

  • Dataset

MeSH terms

  • Algorithms
  • Brain* / diagnostic imaging
  • Brain* / pathology
  • Humans
  • Image Processing, Computer-Assisted
  • Magnetic Resonance Imaging
  • Neuroimaging
  • Stroke* / diagnostic imaging
  • Stroke* / pathology