The gold standard for identifying stroke lesions is manual tracing, a method that is known to be observer dependent and time consuming, thus impractical for big data studies. We propose LINDA (Lesion Identification with Neighborhood Data Analysis), an automated segmentation algorithm capable of learning the relationship between existing manual segmentations and a single T1-weighted MRI. A dataset of 60 left hemispheric chronic stroke patients is used to build the method and test it with k-fold and leave-one-out procedures. With respect to manual tracings, predicted lesion maps showed a mean dice overlap of 0.696 ± 0.16, Hausdorff distance of 17.9 ± 9.8 mm, and average displacement of 2.54 ± 1.38 mm. The manual and predicted lesion volumes correlated at r = 0.961. An additional dataset of 45 patients was utilized to test LINDA with independent data, achieving high accuracy rates and confirming its cross-institutional applicability. To investigate the cost of moving from manual tracings to automated segmentation, we performed comparative lesion-to-symptom mapping (LSM) on five behavioral scores. Predicted and manual lesions produced similar neuro-cognitive maps, albeit with some discussed discrepancies. Of note, region-wise LSM was more robust to the prediction error than voxel-wise LSM. Our results show that, while several limitations exist, our current results compete with or exceed the state-of-the-art, producing consistent predictions, very low failure rates, and transferable knowledge between labs. This work also establishes a new viewpoint on evaluating automated methods not only with segmentation accuracy but also with brain-behavior relationships. LINDA is made available online with trained models from over 100 patients.
Keywords: VLSM; automatic; hierarchical; machine learning; random forests; subacute.
© 2016 Wiley Periodicals, Inc.