Improving Interpretability in Machine Diagnosis: Detection of Geographic Atrophy in OCT Scans

Ophthalmol Sci. 2021 Jul 13;1(3):100038. doi: 10.1016/j.xops.2021.100038. eCollection 2021 Sep.

Abstract

Purpose: Manually identifying geographic atrophy (GA) presence and location on OCT volume scans can be challenging and time-consuming. This study developed a deep learning model that simultaneously (1) performs automated detection of GA presence or absence from OCT volume scans and (2) provides interpretability by showing which regions of which B-scans show GA.

Design: Med-XAI-Net, an interpretable deep learning model, was developed to detect GA presence or absence from OCT volume scans using only volume scan labels, as well as to interpret the most relevant B-scans and B-scan regions.

Participants: One thousand two hundred eighty-four OCT volume scans (each containing 100 B-scans) from 311 participants, including 321 volumes with GA and 963 volumes without GA.

Methods: Med-XAI-Net simulates the human diagnostic process by using a region-attention module to locate the most relevant region in each B-scan, followed by an image-attention module to select the most relevant B-scans for classifying GA presence or absence in each OCT volume scan. Med-XAI-Net was trained and tested (on 80% and 20% of participants, respectively) using gold standard volume scan labels from human expert graders.
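
The two-stage attention described in the Methods can be illustrated with a small sketch. This is not the authors' implementation: the per-region feature vectors, the scoring vectors `w_region` and `w_image`, the number of regions per B-scan, and all function names are assumptions chosen for illustration, and random numbers stand in for learned features. The sketch only shows how region-level attention can be pooled into B-scan features, and how B-scan-level attention can then rank B-scans within a volume.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Attention-weighted sum of equal-length feature vectors."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


def two_stage_attention(volume, w_region, w_image):
    """volume[b][r] is a hypothetical feature vector for region r of B-scan b."""
    region_attn, bscan_feats = [], []
    for regions in volume:
        # Stage 1 (region attention): weight the regions inside one B-scan.
        attn = softmax([dot(feat, w_region) for feat in regions])
        region_attn.append(attn)
        bscan_feats.append(weighted_sum(attn, regions))
    # Stage 2 (image attention): weight the B-scans inside the volume.
    image_attn = softmax([dot(feat, w_image) for feat in bscan_feats])
    volume_feat = weighted_sum(image_attn, bscan_feats)
    return volume_feat, region_attn, image_attn


# Toy volume: 100 B-scans (as in the study); 9 regions and 4 features are assumed.
random.seed(0)
N_BSCANS, N_REGIONS, DIM = 100, 9, 4
volume = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_REGIONS)]
          for _ in range(N_BSCANS)]
w_region = [random.gauss(0, 1) for _ in range(DIM)]
w_image = [random.gauss(0, 1) for _ in range(DIM)]

volume_feat, region_attn, image_attn = two_stage_attention(volume, w_region, w_image)

# The 5 B-scans with the highest image attention, and the top region in each B-scan.
top5_bscans = sorted(range(N_BSCANS), key=lambda b: image_attn[b], reverse=True)[:5]
top_region = [max(range(N_REGIONS), key=lambda r: attn[r]) for attn in region_attn]
```

In a trained model the pooled `volume_feat` would feed a classifier supervised only by the volume-level label, so the attention weights are learned without B-scan or region annotations; here they are untrained and serve only to show the data flow.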

Main outcome measures: Accuracy, area under the receiver operating characteristic (ROC) curve, F1 score, sensitivity, and specificity.
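
For readers less familiar with these measures, the following sketch shows how each can be computed for a binary GA-present (1) versus GA-absent (0) task. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration and are not data from the study. The area under the ROC curve is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), which is one standard way to obtain it.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, F1 score, sensitivity, and specificity from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumes both classes are present; otherwise a denominator could be zero.
    return {
        "accuracy": (tp + tn) / len(y_true),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve as the fraction of positive/negative pairs
    in which the positive case has the higher score (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 volumes with GA, 5 without (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy, F1 score, sensitivity, and specificity depend on the chosen threshold, whereas the area under the ROC curve summarizes performance across all thresholds.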

Results: In the detection of GA presence or absence, Med-XAI-Net obtained superior performance (91.5%, 93.5%, 82.3%, 82.8%, and 94.6% on accuracy, area under the ROC curve, F1 score, sensitivity, and specificity, respectively) to that of 2 other state-of-the-art deep learning methods. The performance of ophthalmologists grading only the 5 B-scans selected by Med-XAI-Net as most relevant (95.7%, 95.4%, 91.2%, and 100%, respectively) was almost identical to that of ophthalmologists grading all volume scans (96.0%, 95.7%, 91.8%, and 100%, respectively). Even grading only 1 region in 1 B-scan, the ophthalmologists demonstrated moderately high performance (89.0%, 87.4%, 77.6%, and 100%, respectively).

Conclusions: Despite using ground truth labels during training at the volume scan level only, Med-XAI-Net was effective in locating GA in B-scans and selecting relevant B-scans within each volume scan for GA diagnosis. These results illustrate the strengths of Med-XAI-Net in interpreting which regions and B-scans contribute to GA detection in the volume scan.

Keywords: AMD, age-related macular degeneration; AREDS2, Age-Related Eye Disease Study 2; AUC, area under curve; CAM, class activation mapping; CFP, color fundus photograph; CNN, convolutional neural network; Deep learning; GA detection; GA, geographic atrophy; Grad-CAM, gradient-weighted class activation mapping; I3D, Inflated 3D Convnet; Interpretable; OCT; PR, precision-recall; PR-AUC, area under PR curve; ROC, receiver operating characteristic; RPE, retinal pigment epithelium; SD, spectral-domain; XAI, explainable artificial intelligence; cRORA, complete retinal pigment epithelium and outer retinal atrophy.