Concordance among experts in assessing apical mucosal preservation during holmium laser enucleation of the prostate (HoLEP): implications for artificial intelligence model development

World J Urol. 2025 Dec 4;44(1):20. doi: 10.1007/s00345-025-06118-x.

Abstract

Objective: To quantify interrater reliability among expert urologists in visually assessing apical mucosal preservation during holmium laser enucleation of the prostate (HoLEP) and to examine the association between preservation ratings and early postoperative continence, thereby informing the design of computer‑vision algorithms.

Methods: Sixty anonymized video segments from “en‑bloc” HoLEP procedures performed between June 2023 and May 2024 were independently reviewed by six HoLEP surgeons. Each rater classified mucosal integrity as completely preserved, partially preserved, or not preserved. Interrater agreement was quantified with pairwise Cohen’s κ and multi‑rater Fleiss’ κ. The predictive value of consensus ratings for 6‑week continence was evaluated using logistic regression and receiver‑operating‑characteristic (ROC) analysis.
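As an illustration of the two agreement statistics named above, the following is a minimal sketch using the textbook definitions of Cohen's κ and Fleiss' κ. It is not the authors' analysis code. The function names, the label strings, and the toy ratings are all hypothetical and chosen only to show the mechanics.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    # Chance agreement from each rater's own marginal label frequencies.
    p_exp = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)  # undefined if p_exp == 1


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] holds every rater's label for item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])  # assumes the same number of raters per item
    counts = [Counter(r) for r in ratings]
    cats = set().union(*counts)
    # Per-item agreement: share of rater pairs that chose the same label.
    p_i = [
        (sum(c[k] ** 2 for k in cats) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_items
    # Chance agreement from the pooled label proportions.
    p_j = [sum(c[k] for c in counts) / (n_items * n_raters) for k in cats]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy data: 4 video segments rated by 3 raters (not study data).
toy = [
    ["complete", "complete", "partial"],
    ["partial", "none", "partial"],
    ["none", "none", "none"],
    ["complete", "partial", "none"],
]
by_rater = list(zip(*toy))  # transpose to one label sequence per rater
pairwise = {
    (i, j): cohen_kappa(by_rater[i], by_rater[j])
    for i, j in combinations(range(len(by_rater)), 2)
}
print(pairwise)
print(fleiss_kappa(toy))
```

Both statistics treat the three grades as unordered categories, so a disagreement between "complete" and "none" counts the same as one between "complete" and "partial".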

Results: Pairwise Cohen’s κ ranged from 0.07 to 0.44, and the overall Fleiss’ κ was 0.18, indicating poor concordance. Agreement was highest between surgeons trained at the same institution (κ = 0.44). The partially preserved class accounted for most disagreements. The majority‑vote preservation grade predicted continence poorly (ROC‑AUC 0.60). Worse mucosal preservation tended to coincide with higher incontinence rates, but these trends were exploratory and not suitable for inferential interpretation.
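To make the majority-vote ROC analysis concrete, here is a minimal sketch of how a consensus grade and its ROC-AUC could be computed. It rests on several assumptions that the abstract does not state: the ordinal coding (0 = completely preserved, 2 = not preserved), the tie-breaking rule (the worse grade wins), and the outcome coding (1 = incontinent at 6 weeks) are hypothetical, as are the function names and the toy data.

```python
from collections import Counter

# Hypothetical ordinal coding: a higher value means worse preservation.
GRADES = {"complete": 0, "partial": 1, "none": 2}


def majority_grade(labels):
    """Most frequent grade; ties go to the worse grade (an assumption)."""
    counts = Counter(GRADES[label] for label in labels)
    top = max(counts.values())
    return max(g for g, c in counts.items() if c == top)


def roc_auc(scores, outcomes):
    """ROC-AUC as the probability that a positive case outscores a negative
    one, counting tied scores as one half (the Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 segments, 3 raters, 1 = incontinent at 6 weeks.
toy_ratings = [
    ["complete", "complete", "partial"],
    ["partial", "none", "partial"],
    ["none", "none", "partial"],
    ["complete", "partial", "complete"],
]
toy_outcomes = [0, 1, 1, 0]
consensus = [majority_grade(r) for r in toy_ratings]
print(consensus, roc_auc(consensus, toy_outcomes))
```

With only three possible grades, many positive-negative pairs tie on the consensus score. Each tie contributes one half, which is one reason a coarse ordinal label tends to produce an AUC near 0.5.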

Conclusion: Expert visual assessment of apical mucosal preservation lacks sufficient reliability to serve as ground truth for supervised computer‑vision training. Because the logistic regression model did not converge and the observed associations were exploratory, mucosal-preservation ratings should not be used as inferential predictors. Standardized grading criteria or outcome‑based labels are needed to develop robust AI tools aimed at reducing transient stress urinary incontinence after HoLEP.

Supplementary Information: The online version contains supplementary material available at 10.1007/s00345-025-06118-x.

Keywords: Computer vision; Holmium; Lasers; Prostatic hyperplasia; Reproducibility of results; Stress; Surgical; Urinary incontinence.