Background: Computer vision may aid in melanoma detection.
Objective: We sought to compare the melanoma diagnostic accuracy of computer algorithms with that of dermatologists using dermoscopic images.
Methods: We conducted a cross-sectional study using 100 randomly selected dermoscopic images (50 melanomas, 44 nevi, and 6 lentigines) from an international computer vision melanoma challenge dataset (n = 379), along with individual algorithm results from 25 teams. We used 5 methods (nonlearned and machine learning) to combine individual automated predictions into "fusion" algorithms. In a companion study, 8 dermatologists classified the lesions in the 100 images as either benign or malignant.
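As an illustration only, a nonlearned fusion of the kind mentioned in the Methods could be as simple as averaging the per-image scores from each algorithm. The abstract does not specify which 5 fusion methods were used, so the function name `average_fusion` and the assumption that every algorithm outputs a score in [0, 1] per image are hypothetical choices for this sketch.

```python
from statistics import mean


def average_fusion(predictions):
    """Combine per-algorithm scores by unweighted averaging.

    predictions: list of score lists, one list per algorithm, each holding
    one score per image (assumed here to lie in [0, 1]; higher means more
    suspicious for melanoma). This is a hypothetical nonlearned fusion,
    not necessarily one of the methods used in the study.
    """
    if not predictions:
        raise ValueError("need at least one algorithm's predictions")
    n_images = len(predictions[0])
    if any(len(p) != n_images for p in predictions):
        raise ValueError("all algorithms must score the same images")
    # zip(*predictions) groups the scores that each algorithm gave one image
    return [mean(scores) for scores in zip(*predictions)]


if __name__ == "__main__":
    # Three made-up algorithms scoring four made-up images
    algo_scores = [
        [0.9, 0.2, 0.6, 0.1],
        [0.8, 0.4, 0.5, 0.3],
        [0.7, 0.3, 0.7, 0.2],
    ]
    print(average_fusion(algo_scores))
```

A learned fusion would instead fit weights (or a classifier) on the individual scores using held-out labeled images; the averaging above needs no training data.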
Results: The mean sensitivity and specificity of dermatologists in classification were 82% and 59%, respectively. At 82% sensitivity, dermatologist specificity was similar to the top challenge algorithm (59% vs. 62%, P = .68) but lower than the best-performing fusion algorithm (59% vs. 76%, P = .02). Receiver operating characteristic area of the top fusion algorithm was greater than the mean receiver operating characteristic area of dermatologists (0.86 vs. 0.71, P = .001).
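To make the "specificity at 82% sensitivity" comparison concrete, the following minimal sketch shows one way to read an algorithm's specificity off its scores at a fixed sensitivity. It assumes continuous scores where higher means more suspicious and binary labels (1 = melanoma, 0 = benign); the function name and the threshold-selection rule are assumptions for illustration, not the study's documented procedure.

```python
def specificity_at_sensitivity(labels, scores, target_sensitivity):
    """Return (threshold, sensitivity, specificity) at the highest score
    threshold whose sensitivity reaches target_sensitivity.

    labels: 1 for melanoma, 0 for benign (assumed encoding).
    scores: higher means more suspicious; an image is called malignant
            when its score is >= threshold.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one melanoma and one benign lesion")

    # Lowering the threshold can only raise sensitivity, so the first
    # threshold that meets the target also gives the best specificity.
    for threshold in sorted(set(scores), reverse=True):
        sensitivity = sum(s >= threshold for s in positives) / len(positives)
        if sensitivity >= target_sensitivity:
            specificity = sum(s < threshold for s in negatives) / len(negatives)
            return threshold, sensitivity, specificity
    raise ValueError("target sensitivity cannot be reached")


if __name__ == "__main__":
    # Made-up labels and scores for eight lesions
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
    print(specificity_at_sensitivity(labels, scores, 0.75))
```

Fixing sensitivity at the readers' mean value puts algorithm and dermatologist specificities on the same operating point, which is what allows the 59% vs. 62% and 59% vs. 76% comparisons to be interpreted directly.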
Limitations: The dataset lacked the full spectrum of skin lesions encountered in clinical practice, particularly banal lesions. Readers and algorithms were not provided clinical data (eg, age or lesion history/symptoms). Results obtained using our study design cannot be extrapolated to clinical practice.
Conclusion: Deep learning computer vision systems classified melanoma dermoscopy images with accuracy that exceeded that of some but not all dermatologists.
Keywords: International Skin Imaging Collaboration; International Symposium on Biomedical Imaging; computer algorithm; computer vision; dermatologist; machine learning; melanoma; reader study; skin cancer.
Copyright © 2017 American Academy of Dermatology, Inc. Published by Elsevier Inc. All rights reserved.