Objectives: To develop and evaluate a convolutional neural network (CNN)-based model for recognising surgical phases in robot-assisted laparoscopic radical prostatectomy (RARP), with an emphasis on model interpretability and cross-platform validation.
Methods: A CNN with an EfficientNet-B7 backbone was trained on video data from 75 RARP cases performed with the hinotori robotic system. Seven phases were annotated: bladder drop, prostate preparation, bladder neck dissection, seminal vesicle dissection, posterior dissection, apical dissection, and vesicourethral anastomosis. A total of 808 774 video frames were extracted at 1 frame/s for training and testing. Validation was performed on 25 RARP cases performed with the da Vinci robotic system to assess cross-platform generalisability. Gradient-weighted class activation mapping (Grad-CAM) was used to enhance interpretability by identifying the key regions of interest underlying phase classification.
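The study's code is not reproduced here; the following is a minimal sketch, assuming PyTorch/torchvision, of a frame-level seven-phase classifier built on an ImageNet-pretrained EfficientNet-B7. Phase names, input resolution, and preprocessing are illustrative assumptions rather than the published configuration.

# Minimal sketch (assumption, not the study's code): a frame-level,
# seven-phase classifier built on an ImageNet-pretrained EfficientNet-B7.
import torch
import torch.nn as nn
from torchvision import models, transforms

PHASES = [
    "bladder_drop", "prostate_preparation", "bladder_neck_dissection",
    "seminal_vesicle_dissection", "posterior_dissection",
    "apical_dissection", "vesicourethral_anastomosis",
]

def build_model(num_classes: int = len(PHASES)) -> nn.Module:
    # Replace the ImageNet classification head with a 7-way linear layer.
    model = models.efficientnet_b7(
        weights=models.EfficientNet_B7_Weights.IMAGENET1K_V1)
    model.classifier[1] = nn.Linear(model.classifier[1].in_features, num_classes)
    return model

# Per-frame preprocessing for frames sampled at 1 frame/s (resolution assumed).
preprocess = transforms.Compose([
    transforms.Resize((600, 600)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

if __name__ == "__main__":
    model = build_model().eval()
    frame = torch.randn(1, 3, 600, 600)  # stand-in for one preprocessed frame
    with torch.no_grad():
        logits = model(frame)
    print("predicted phase:", PHASES[int(logits.argmax(dim=1))])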
Results: The CNN achieved an accuracy of 0.90 on the hinotori test set but only 0.64 on the da Vinci dataset, indicating limited cross-platform generalisability. Phase-specific F1 scores ranged from 0.77 to 0.97, with the lowest performance in the seminal vesicle dissection and apical dissection phases. Grad-CAM visualisations showed that the model focused on central pelvic structures rather than transient instruments, supporting interpretability and providing insight into how phases were classified.
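For readers unfamiliar with Grad-CAM, the sketch below shows one common way such heat maps are computed for a classifier like the one above: gradients of the predicted-phase score are pooled over the last convolutional feature map to weight its channels. The hook points, helper name, and target layer are assumptions for illustration, not the authors' implementation.

# Minimal Grad-CAM sketch (assumption, not the study's implementation):
# weight the last convolutional feature map by the gradient of the
# predicted-phase score, yielding a coarse heat map over the frame.
import torch
import torch.nn.functional as F

def grad_cam(model, frame, target_layer):
    """Return an [H, W] heat map in [0, 1] for the model's top class."""
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    try:
        logits = model(frame)                      # frame: [1, 3, H, W]
        top = int(logits.argmax(dim=1))
        logits[0, top].backward()                  # d(score)/d(activations)
    finally:
        h1.remove()
        h2.remove()
    a, g = acts[0], grads[0]                       # both [1, C, h, w]
    weights = g.mean(dim=(2, 3), keepdim=True)     # channel importance
    cam = F.relu((weights * a).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=frame.shape[-2:], mode="bilinear",
                        align_corners=False).squeeze()
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

# Hypothetical usage with the classifier sketched above:
#   model = build_model().eval()
#   heatmap = grad_cam(model, frame, target_layer=model.features[-1])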
Conclusions: The model achieved high accuracy on a single robotic platform but requires further refinement to perform consistently across platforms. Interpretability techniques such as Grad-CAM can foster clinical trust and support integration into surgical workflows, advancing applications of artificial intelligence in robotic surgery.
Keywords: artificial intelligence; convolutional neural network; deep learning; phase recognition; prostatectomy; robotic surgical procedures.
© 2025 BJU International.