Assessing Quantitative Performance and Expert Review of Multiple Deep Learning-Based Frameworks for Computed Tomography-based Abdominal Organ Auto-Segmentation

Intell Oncol. 2025 Apr;1(2):160-171. doi: 10.1016/j.intonc.2025.03.003. Epub 2025 Mar 19.

Abstract

Background: Segmentation of abdominal organs in computed tomography (CT) images within clinical oncological workflows is crucial for ensuring effective treatment planning and follow-up. However, manually generated segmentations are time-consuming and labor-intensive in addition to being subject to inter-observer variability. Many deep learning (DL) and Automated Machine Learning (AutoML) frameworks have emerged as a solution to this challenge and show promise in clinical workflows.

Objective: This study presents a comprehensive evaluation of existing AutoML frameworks (Auto3DSeg, nnU-Net) against a state-of-the-art non-AutoML framework, the Shifted Window U-Net Transformer (SwinUNETR).

Methods: Each framework was trained on the same 122 training images, taken from the Abdominal Multi-Organ Segmentation (AMOS) grand challenge. Frameworks were compared using Dice Similarity Coefficient (DSC), Surface DSC (sDSC), and 95th Percentile Hausdorff Distances (HD95) on an additional 72 holdout-validation images. The perceived clinical viability of 30 auto-contoured test cases was assessed by three physicians in a blinded evaluation.

Results: Comparisons show significantly better performance by AutoML methods. nnU-Net (average DSC: 0.924, average sDSC: 0.938, average HD95: 4.26, median Likert: 4.57), Auto3DSeg (average DSC: 0.902, average sDSC: 0.919, average HD95: 8.76, median Likert: 4.49), and SwinUNETR (average DSC: 0.837, average sDSC: 0.844, average HD95: 13.93). AutoML frameworks were quantitatively preferred (13/13 OARs P >0.0.5 in DSC and sDSC, 12/13 OARs P>0.05 in HD95, comparing Auto3DSeg to SwinUNETR, and all OARs P>0.05 in all metrics comparing SwinUNETR to nnU-Net). Qualitatively, nnU-Net was preferred over Auto3DSeg (P=0.0027).

Conclusion: The findings suggest that AutoML frameworks offer a significant advantage in the segmentation of abdominal organs, and underscores the potential of AutoML methods to enhance the efficiency of oncological workflows.

Keywords: Auto3DSeg; AutoML; Radiation Oncology; Segmentation; nnU-Net.