Introduction: We aimed to investigate the intergrader and intragrader reliability of human graders and an automated algorithm for vertical cup-disc ratio (CDR) grading in colour fundus photographs.
Materials and methods: Two hundred fundus photographs were selected from a database of 3000 photographs of patients screened at a tertiary ophthalmology referral centre. The graders comprised glaucoma specialists (n = 3), general ophthalmologists (n = 2), optometrists (n = 2), family physicians (n = 2) and a novel automated algorithm (AA). Each grader completed 2 rounds of CDR grading on 2 different dates, with the photographs presented in random order. CDR values were graded from 0.1 to 1.0, or as ungradable. The grading results of the 2 senior glaucoma specialists were used as the reference benchmarks for comparison.
Results: The intraclass correlation coefficient values ranged from 0.37 to 0.74 for intergrader reliability and from 0.47 to 0.97 for intragrader reliability. There was no significant correlation between the human graders' level of reliability and their years of experience in grading CDR (P = 0.91). The area under the curve (AUC) value of the AA was 0.847, comparable to the AUC value of 0.876 for the glaucoma specialist. Bland-Altman plots demonstrated that the AA's performance was at least comparable to that of a glaucoma specialist.
Conclusion: The results suggest that the AA is comparable to human graders in CDR grading of fundus photographs and may perform more consistently. The AA may be applicable as a screening tool to help detect asymptomatic glaucoma-suspect patients in the community.
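For readers unfamiliar with the Bland-Altman comparison mentioned in the Results, the following is a minimal sketch of how the bias and 95% limits of agreement between two sets of paired CDR gradings are conventionally computed. It is not the study's analysis code: the function name `bland_altman` and the example gradings are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev


def bland_altman(grader_a, grader_b):
    """Return (bias, lower_limit, upper_limit) for paired gradings.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(grader_a) != len(grader_b):
        raise ValueError("paired gradings must have equal length")
    diffs = [a - b for a, b in zip(grader_a, grader_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical CDR gradings for five photographs (NOT data from the study).
algorithm = [0.3, 0.5, 0.7, 0.4, 0.6]
specialist = [0.4, 0.5, 0.6, 0.4, 0.7]

bias, lower, upper = bland_altman(algorithm, specialist)
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

A bias close to zero with narrow limits of agreement indicates that the two graders rarely differ by much on the same photograph. Ungradable photographs would need to be excluded from the pairs before such a calculation.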