Introduction and hypothesis: The aim of this study was to evaluate the interobserver reliability of diagnosing levator avulsions with tomographic ultrasound imaging (TUI) among observers from different centers in women after their first delivery.
Methods: Transperineal ultrasound volume datasets of 40 women, acquired 6 months after their first delivery, were analyzed by five observers from four different centers. Levator avulsions were diagnosed using TUI, and each dataset was rated as having optimal or suboptimal image quality and optimal or suboptimal pelvic floor contraction. Cohen's kappa was used to evaluate the interobserver reliability of diagnosing levator avulsions for the total group, for the subgroups with optimal and suboptimal image quality, and for the subgroups with optimal and suboptimal pelvic floor contraction. Consensus on the presence or absence of avulsions was scored according to the number of observers who diagnosed an avulsion (0 = consensus on the absence of avulsion, 1-4 = avulsion diagnosed by 1 to 4 observers, 5 = consensus on the presence of avulsion).
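The two measures described in the Methods can be illustrated with a minimal sketch. It assumes that Cohen's kappa was computed separately for each pair of observers and that each diagnosis is coded as 1 (avulsion) or 0 (no avulsion). The function names, observer labels, and ratings below are hypothetical and made up for illustration; they are not study data.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two observers' binary labels (1 = avulsion, 0 = none)."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n        # observed agreement
    pa1, pb1 = sum(a) / n, sum(b) / n                    # each observer's avulsion rate
    p_exp = pa1 * pb1 + (1 - pa1) * (1 - pb1)            # agreement expected by chance
    if p_exp == 1.0:
        return float("nan")  # undefined: both observers used one identical category
    return (p_obs - p_exp) / (1 - p_exp)


def pairwise_kappas(ratings):
    """Kappa for every pair of observers; ratings maps observer -> list of labels."""
    return {
        (i, j): cohen_kappa(ratings[i], ratings[j])
        for i, j in combinations(sorted(ratings), 2)
    }


def consensus_scores(ratings):
    """Per woman, the number of observers who diagnosed an avulsion (0 to 5 here)."""
    return [sum(labels) for labels in zip(*ratings.values())]


# Hypothetical ratings: five observers, six women (illustrative only).
ratings = {
    "A": [1, 0, 0, 1, 0, 0],
    "B": [1, 0, 1, 1, 0, 0],
    "C": [1, 0, 0, 0, 0, 1],
    "D": [1, 0, 0, 1, 0, 0],
    "E": [1, 0, 1, 0, 0, 0],
}

kappas = pairwise_kappas(ratings)
scores = consensus_scores(ratings)
```

With five observers there are ten observer pairs, so a range of kappa values would correspond to the minimum and maximum over these pairs.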
Results: For the total group, the interobserver reliability varied widely, with kappa values ranging from -0.07 to 0.72. Analyses in the subgroups showed comparable results. Among the women who potentially had an avulsion (avulsion diagnosed by at least one observer), consensus on the presence of an avulsion was reached in 0.0 to 20.0 %. Among the women who potentially had no avulsion (no avulsion diagnosed by at least one observer), consensus on the absence of an avulsion was reached in 46.7 to 85.7 %.
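The consensus percentages can be read as conditional proportions over the per-woman consensus score (the number of observers who diagnosed an avulsion). The following is a minimal sketch under that reading; the function name and the example scores are hypothetical and are not taken from the study.

```python
def consensus_rates(scores, n_observers=5):
    """Return (presence %, absence %) from per-woman consensus scores.

    presence %: among women with score >= 1 (potential avulsion),
                the share with score == n_observers (all observers agree).
    absence %:  among women with score <= n_observers - 1 (potential absence),
                the share with score == 0 (no observer diagnosed an avulsion).
    A rate is None when its denominator group is empty.
    """
    potential_avulsion = [s for s in scores if s >= 1]
    potential_absence = [s for s in scores if s <= n_observers - 1]

    presence = (
        100 * sum(s == n_observers for s in potential_avulsion) / len(potential_avulsion)
        if potential_avulsion
        else None
    )
    absence = (
        100 * sum(s == 0 for s in potential_absence) / len(potential_absence)
        if potential_absence
        else None
    )
    return presence, absence


# Hypothetical consensus scores for six women (illustrative only).
example_scores = [5, 0, 2, 3, 0, 1]
presence_pct, absence_pct = consensus_rates(example_scores)
```

Note that the two denominators overlap: a woman with a score of 1 to 4 counts as both a potential avulsion and a potential absence.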
Conclusions: Diagnosing levator avulsions using TUI in women 6 months after their first delivery is strongly observer-dependent and therefore not generalizable.