Ensemble neural network model for detecting thyroid eye disease using external photographs

Br J Ophthalmol. 2023 Nov;107(11):1722-1729. doi: 10.1136/bjo-2022-321833. Epub 2022 Sep 8.

Abstract

Purpose: To describe an artificial intelligence platform that detects thyroid eye disease (TED).

Design: Development of a deep learning model.

Methods: A total of 1944 photographs from a clinical database were used to train a deep learning model. An additional 344 images (the 'test set') were used to calculate performance metrics. Receiver operating characteristic curves, precision-recall curves and heatmaps were generated. From the test set, 50 images were randomly selected (the 'survey set') and used to compare model performance with ophthalmologist performance. A further 222 images obtained from a separate clinical database were used to assess model recall and to quantify model performance with respect to disease stage and grade.

Results: The model achieved test set accuracy of 89.2%, specificity 86.9%, recall 93.4%, precision 79.7% and an F1 score of 86.0%. Heatmaps demonstrated that the model identified pixels corresponding to clinical features of TED. On the survey set, the ensemble model achieved accuracy, specificity, recall, precision and F1 score of 86%, 84%, 89%, 77% and 82%, respectively. On the same set, 27 ophthalmologists achieved mean accuracy, specificity, recall, precision and F1 score of 75%, 82%, 63%, 72% and 66%, respectively. On the second test set (the 222 images from the separate database), the model achieved recall of 91.9%, with higher recall for moderate to severe (98.2%, n=55) and active disease (98.3%, n=60), as compared with mild (86.8%, n=68) or stable disease (85.7%, n=63).
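
The five test set metrics above all derive from a single binary confusion matrix. The following is a minimal sketch of that arithmetic in Python. The function name `binary_metrics` is hypothetical, and the counts passed to it are an assumption: they were back-calculated so that, for 344 images, they reproduce the reported rounded percentages. They are not the study's published confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, specificity, recall, precision and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    specificity = tn / (tn + fp)   # true negative rate
    recall = tp / (tp + fn)        # sensitivity, true positive rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# ASSUMPTION: illustrative counts summing to 344, chosen to be consistent
# with the reported percentages; not taken from the paper itself.
metrics = binary_metrics(tp=114, fp=29, tn=193, fn=8)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, precision is lower than recall because false positives (29) outnumber false negatives (8), which matches the pattern in the reported figures.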

Conclusions: The deep learning classifier is a novel approach to identify TED and is a first step in the development of tools to improve diagnostic accuracy and lower barriers to specialist evaluation.

Keywords: Diagnostic tests/Investigation; Imaging; Orbit; Telemedicine.