A prediction model based on random survival forest analysis of the overall survival of elderly female papillary thyroid carcinoma patients: a SEER-based study

Endocrine. 2024 Apr 1. doi: 10.1007/s12020-024-03797-1. Online ahead of print.

Abstract

Objective: Papillary thyroid carcinoma (PTC) is a common malignancy whose incidence is three times greater in females than in males. The prognosis of elderly patients is poor. This study aimed to construct models to predict the overall survival of elderly female patients with PTC.

Methods: We developed prediction models based on the random survival forest (RSF) algorithm and traditional Cox regression. Data for 4539 patients were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Twelve variables were analysed to establish the models. The C-index and the Brier score were used to evaluate the predictive performance of the models. Time-dependent receiver operating characteristic (ROC) curves were also drawn to evaluate the accuracy of the models. The clinical benefits of the two models were compared using decision curve analysis (DCA). In addition, a Shapley Additive Explanations (SHAP) plot was used to visualize the contribution of the variables in the RSF model.
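
As an illustration of the C-index named in the Methods, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' code: the function name, the toy inputs, and the choice to skip pairs with tied times are assumptions made for this example only.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk for the earlier death: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from SEER): four patients, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.2]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.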

Results: The C-index of the RSF model was 0.811, which was greater than that of the Cox model (0.781). According to the Brier score and the area under the ROC curve (AUC), the RSF model performed better than the Cox model. On the basis of the decision curve, the RSF model demonstrated fair clinical benefit. The SHAP plot showed that age was the most important variable contributing to the outcome of PTC in elderly female patients.
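
The clinical benefit read from a decision curve is the net benefit at each threshold probability. The sketch below shows the standard binary-outcome form of that quantity, assuming a fixed time horizon and no censoring; a time-to-event analysis such as the one reported here would need a censoring-adjusted version. The function name and the toy values are hypothetical.

```python
def net_benefit(outcomes, predicted_risks, threshold):
    """Net benefit at one threshold probability (binary-outcome sketch).

    outcomes        -- 1 if the event occurred by the horizon, else 0
    predicted_risks -- predicted event probability per patient
    threshold       -- risk at or above which a patient is classed as high risk
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(outcomes, predicted_risks))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * (threshold / (1.0 - threshold))


# Toy data (hypothetical): four patients, two with the event.
toy_outcomes = [1, 0, 1, 0]
toy_predicted = [0.8, 0.6, 0.3, 0.1]
nb_at_half = net_benefit(toy_outcomes, toy_predicted, 0.5)
nb_at_fifth = net_benefit(toy_outcomes, toy_predicted, 0.2)
```

Plotting this value across a range of thresholds, alongside the "treat all" and "treat none" strategies, gives the decision curve.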

Conclusions: The RSF model we developed performed better than the Cox model and might be valuable for clinical practice.

Keywords: Papillary thyroid carcinoma; Surveillance, Epidemiology, and End Results Program; elderly female patients; machine learning; random survival forest; visualization.