Ensemble learning can mitigate the risk of model overfitting during training. This study aimed to evaluate the performance of ensemble learning models in predicting tumor deposits in rectal cancer (RC) and to identify the optimal model for preoperative clinical decision-making. A total of 199 RC patients were analyzed. Radiomic features were extracted from T2-weighted and apparent diffusion coefficient images and then selected using statistical methods. A bagging-ensemble learning model (random forest), boosting-ensemble learning models (XGBoost, AdaBoost, LightGBM, and CatBoost), and a voting-ensemble learning model (VELM, integrating 5 classifiers) were then built and optimized using grid search with tenfold cross-validation. The area under the receiver operating characteristic curve (AUC), calibration curves, t-distributed stochastic neighbor embedding (t-SNE), and decision curve analysis were used to evaluate the performance of each model. The VELM performed best in the testing cohort, with an AUC of 0.875 and an accuracy of 0.800. Calibration plots confirmed the VELM's stability, and t-SNE visualization showed clear clustering of radiomic features. Decision curve analysis further showed that the VELM had a superior net benefit across a range of clinical thresholds, underscoring its potential as a reliable tool for clinical decision-making in RC.
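The net benefit reported by decision curve analysis can be illustrated with a minimal sketch. It assumes the standard definition, net benefit = TP/N - (FP/N) x pt/(1 - pt), where pt is the threshold probability. The function names and the toy labels and probabilities below are hypothetical and are not taken from the study's data or code.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating every patient whose predicted probability >= threshold.

    Assumes the standard decision-curve definition:
        TP/N - (FP/N) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient regardless of the model output."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = tumor deposits present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7, 0.2, 0.05]

if __name__ == "__main__":
    for pt in (0.1, 0.3, 0.5):
        model_nb = net_benefit(y_true, y_prob, pt)
        all_nb = treat_all_net_benefit(y_true, pt)
        print(f"pt={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, which is zero by definition. Plotting these values over a range of thresholds gives the decision curve.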
Keywords: Ensemble learning; Machine learning; Prediction; Radiomics; Rectal cancer.
© 2025. The Author(s).