Background and objectives: Deep learning-convolutional neural networks (DL-CNNs) have demonstrated high diagnostic accuracy in dermoscopy. However, many clinical settings lack dermoscopic devices and must rely on close-up images instead. This study evaluates the robustness of a DL-CNN trained on dermoscopic images when it is applied to close-up images.
Methods: In this cross-sectional study, the DL-CNN Moleanalyzer pro, trained on 129,487 dermoscopic images, was tested on 350 skin lesions, each documented with both a close-up image and a dermoscopic image. Histopathology (89.4%) or expert consensus with a two-year follow-up (10.6%) served as the reference standard. Primary outcomes included sensitivity, specificity, and the area under the receiver operating characteristic curve (ROC-AUC).
Results: For dermoscopic images, the DL-CNN achieved a sensitivity of 88.2% (95% CI: 82.9%-92.0%), specificity of 69.0% (61.4%-75.8%), and a ROC-AUC of 0.866 (0.860-0.873). For close-up images of the same lesions, sensitivity decreased to 60.5% (53.5%-67.1%, p < 0.001), while specificity increased to 79.4% (72.3%-85.0%, p = 0.027). The ROC-AUC for close-up images was 0.780 (0.772-0.790, p = 0.003).
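Sensitivity and specificity estimates such as those reported above are proportions derived from a 2x2 confusion matrix, and each is usually accompanied by a binomial confidence interval. The abstract does not state which interval method was used, so the following is only a minimal sketch that assumes a Wilson score interval. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of malignant lesions flagged as malignant
    specificity = TN / (TN + FP): share of benign lesions classified as benign
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%).

    Assumption: the study's actual CI method is not given in the abstract.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical counts for illustration only (not the study's data).
tp, fn, tn, fp = 90, 10, 80, 20
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
print(f"sensitivity {sens:.3f}, 95% CI {sens_ci[0]:.3f}-{sens_ci[1]:.3f}")
print(f"specificity {spec:.3f}, 95% CI {spec_ci[0]:.3f}-{spec_ci[1]:.3f}")
```

Because both image types come from the same lesions, a comparison of the two sensitivities would normally call for a paired test; that step is outside this sketch.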
Conclusions: Applying a DL-CNN trained on dermoscopic images to close-up images reduced sensitivity while increasing specificity, which highlights the diagnostic limitations of this cross-domain use. The results emphasize the need to improve cross-domain adaptability. Fine-tuning of DL-CNNs on clinical images may enhance real-world diagnostic accuracy.
Keywords: Clinical images; DL-CNN; dermoscopy; robustness; teledermatology.
© 2025 The Author(s). Journal der Deutschen Dermatologischen Gesellschaft published by John Wiley & Sons Ltd on behalf of Deutsche Dermatologische Gesellschaft.