Purpose: To develop and validate a novel artificial intelligence (AI)-powered video analysis system to assess surgeon proficiency in maintaining (1) eye neutrality, (2) eye centration, and (3) adequate focus of the operating microscope in cataract surgery, and to evaluate differences in these metrics between attending cataract surgeons and ophthalmology residents.
Design: A retrospective surgical video analysis.
Subjects: Six hundred twenty complete surgical video recordings of 620 cataract surgeries performed by either attending surgeons or ophthalmology residents.
Main outcome measures: Performance of the proposed AI-powered video analysis system (CatSkill) for cataract surgery was evaluated at multiple stages. Anatomy and surgical landmark segmentation were reported as Dice coefficients. The proposed cataract surgery assessment metrics (CSAMs) were compared between attending and resident surgeons on a phase-wise basis. Surgery-level classification performance (attending vs. resident) of a machine learning (ML) algorithm trained on the CSAMs was assessed using area under the receiver operating characteristic curve (AUC).
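For readers unfamiliar with the segmentation metric named above, the Dice coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that formula on small binary masks, not the CatSkill implementation; the function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice = 2*|A ∩ B| / (|A| + |B|) for two binary masks of equal shape.

    `pred` and `truth` are nested lists of 0/1 values (hypothetical format).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    total = 0
    for row_p, row_t in zip(pred, truth):
        if len(row_p) != len(row_t):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(row_p, row_t):
            intersection += 1 if (p and t) else 0
            total += (1 if p else 0) + (1 if t else 0)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 reference pixels, 2 shared pixels.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
dice_toy = dice_coefficient(pred_mask, true_mask)  # 2*2 / (3+3)
```

A value reported as a percentage, as in this abstract, corresponds to this ratio multiplied by 100.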
Methods: An automated system involving video preprocessing, deep learning-based segmentation with limbus obstruction detection and compensation, and CSAM computation was designed to assess surgeon performance based on surgical videos. Three CSAMs were computed to analyze 430 cataract surgeries (254 attendings and 176 residents). An ML algorithm was developed to predict surgeon training level using only CSAMs.
Results: The CatSkill system using FPN (VGG16) achieved a Dice coefficient of 94.03% for segmentation of palpebral fissure, limbus, and Purkinje image 1. The phase-wise mean CSAM scores were higher for attendings than residents across all surgical phases. Residents struggled with stability/centration during the Main Wound, Cortical Removal, Lens Insertion, and Wound Closure phases, and had difficulty maintaining adequate microscope focus during later phases of surgery. A random forest model using CSAMs achieved an AUC of 0.865 in predicting the skill level (attending or resident) of the surgeon.
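As a reading aid for the AUC figure above: AUC equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. It is illustrative only; the function name `roc_auc`, the toy scores, and the choice of attending = 1 as the positive class are assumptions, not details taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    `labels` holds 1 for the positive class and 0 for the negative class;
    `scores` holds the classifier's predicted score for each case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = attending, 0 = resident.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
auc_toy = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.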
Conclusions: The proposed AI-derived CSAMs provide a high level of reliability in assessing the ability of surgeons to maintain eye neutrality, centration, and microscope focus during cataract surgery. Furthermore, downstream analysis using an ML model for surgery-level classification indicates that the proposed CSAMs provide significant predictive value for assessing the overall training level of the surgeon.
Financial disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
Keywords: Cataract surgery; Deep learning; Segmentation; Surgery skill assessment; Video analysis.
© 2025 by the American Academy of Ophthalmology.