Machine learning-based overall and cancer-specific survival prediction of M0 penile squamous cell carcinoma: A population-based retrospective study

Heliyon. 2023 Dec 8;10(1):e23442. doi: 10.1016/j.heliyon.2023.e23442. eCollection 2024 Jan 15.

Abstract

Background: Penile cancer is a rare tumor, and few studies have focused on the prognosis of M0 penile squamous cell carcinoma (PSCC). This retrospective study aimed to identify independent prognostic factors and construct predictive models for the overall survival (OS) and cancer-specific survival (CSS) of patients with M0 PSCC.

Methods: Data were extracted from the Surveillance, Epidemiology, and End Results (SEER) database for patients diagnosed with malignant penile cancer. Eligible patients with M0 PSCC were selected according to predetermined inclusion and exclusion criteria and then divided into a training set, a validation set, and a test set. Univariate and multivariate Cox regression analyses were first performed to identify independent prognostic factors for OS and CSS in M0 PSCC patients. Subsequently, traditional and machine learning prognostic models, including random survival forest (RSF), Cox, gradient boosting, and component-wise gradient boosting models, were constructed using the scikit-survival framework. The performance of each model was assessed by the time-dependent area under the curve (AUC), the C-index, and the integrated Brier score (IBS) to identify the best-performing model. Finally, Shapley additive explanation (SHAP) values, feature importance, and cumulative rate analyses were used to further evaluate the selected model.
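As an illustration of one of the metrics named above, the following is a minimal pure-Python sketch of Harrell's C-index. It is not the study's code: the study reports using scikit-survival, whereas the function name `concordance_index` and the four-patient toy data below are hypothetical and chosen only to show how comparable pairs are counted under right censoring.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that the model orders correctly.

    times       -- follow-up time for each patient
    events      -- True if the event (death) was observed, False if censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only when the shorter time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data: four patients, the second one censored at month 10.
toy_times = [5, 10, 15, 20]
toy_events = [True, False, True, True]
toy_risk = [0.9, 0.4, 0.1, 0.6]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a higher C-index is read as better discrimination; the IBS, by contrast, is an error measure, so lower values are better.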

Results: A total of 2,446 patients were included in our study. Cox regression analyses demonstrated that age, N stage, and tumor size were predictors of OS, while N stage, tumor size, surgery, and residential area were predictors of CSS. The RSF and Cox models had a higher time-dependent AUC and C-index, and a lower IBS, than the other models in OS and CSS prediction. Feature importance analysis identified N stage as a common significant feature for predicting the survival of M0 PSCC patients. The SHAP and cumulative rate analyses demonstrated that the selected models can effectively evaluate the prognosis of M0 PSCC patients.

Conclusion: In M0 PSCC patients, age, N stage, and tumor size were predictors of OS, while N stage, tumor size, surgery, and residential area were predictors of CSS. The RSF and Cox models effectively predicted the prognosis of M0 PSCC patients.

Keywords: Machine learning; Penile cancer; Penile squamous cell carcinoma; Random survival forest; Survival.