Integrating additional factors into the TNM staging for cutaneous melanoma by machine learning

PLoS One. 2021 Sep 30;16(9):e0257949. doi: 10.1371/journal.pone.0257949. eCollection 2021.

Abstract

Background: Integrating additional factors into the TNM staging system is needed for more accurate risk classification and survival prediction for patients with cutaneous melanoma. In the present study, we introduce machine learning as a novel tool that incorporates additional prognostic factors to improve the current TNM staging system.

Methods and findings: Cancer-specific survival data for cutaneous melanoma with at least 5 years of follow-up were extracted from the Surveillance, Epidemiology, and End Results Program of the National Cancer Institute and split into a training set (40,781 cases) and a validation set (5,390 cases). Five factors were studied: the primary tumor (T), regional lymph nodes (N), distant metastasis (M), age (A), and sex (S). The Ensemble Algorithm for Clustering Cancer Data (EACCD) was applied to the training set to generate prognostic groups. Utilizing only T, N, and M, a basic prognostic system was built in which patients were stratified into 10 prognostic groups with well-separated survival curves, similar to the 10 AJCC stages. These 10 groups had a significantly higher accuracy in survival prediction than the 10 stages (C-index = 0.7682 vs 0.7643; increase in C-index = 0.0039, 95% CI = (0.0032, 0.0047); p-value = 7.2×10^-23). Nevertheless, a positive association remained between the EACCD grouping and the AJCC staging (Spearman's rank correlation coefficient = 0.8316; p-value = 4.5×10^-13). With additional information from A and S, a more advanced prognostic system was established using the training data; it stratified patients into 10 groups and further improved the prediction accuracy (C-index = 0.7865 vs 0.7643; increase in C-index = 0.0222, 95% CI = (0.0191, 0.0254); p-value = 8.8×10^-43). Both internal validation using the training set and temporal validation using the validation set showed good stratification and a high predictive accuracy of the prognostic systems.
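The C-index comparisons reported above can be illustrated with a minimal sketch of Harrell's concordance index, in which the prognostic group number serves as the risk score. This is not the authors' implementation: the function name, the toy data, and the tie handling (tied risk scores count as 0.5, tied survival times are skipped) are assumptions made for illustration, and the abstract does not state which C-index variant was used.

```python
from itertools import combinations


def concordance_index(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the cancer-specific death was observed, 0 if censored
    risk   : risk score, e.g. prognostic group number (higher = worse)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: skip pairs with tied times
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0  # higher-risk patient died earlier
        elif risk[a] == risk[b]:
            concordant += 0.5  # tied risk scores get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not SEER data): five patients in three groups.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_groups = [3, 3, 2, 1, 1]
toy_c = concordance_index(toy_times, toy_events, toy_groups)
print(f"C-index on toy data: {toy_c:.4f}")
```

Computing this value once with AJCC stage as the risk score and once with the EACCD group gives the kind of paired comparison summarized by the reported increases in C-index. With many patients per group, most pairs within a group are tied on risk, which is why the tie convention matters for grouped staging systems.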

Conclusions: The EACCD allows additional factors to be integrated into the TNM to create a prognostic system that improves patient stratification and survival prediction for cutaneous melanoma. This integration separates favorable from unfavorable clinical outcomes for patients and improves both cohort selection for clinical trials and treatment management.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cohort Studies
  • Female
  • Humans
  • Machine Learning
  • Male
  • Melanoma / mortality*
  • Melanoma / pathology*
  • Melanoma, Cutaneous Malignant
  • Neoplasm Staging
  • Prognosis
  • SEER Program
  • Sensitivity and Specificity
  • Skin Neoplasms / mortality*
  • Skin Neoplasms / pathology*
  • Survival Analysis

Grants and funding

This work was partially supported by the grants "Using Dendrograms to Create Prognostic Systems for Cancer" and "Creating Prognostic Systems for Cancer", sponsored by the John P. Murtha Cancer Center Research Program, and by the grant "Four Diamonds Fund from Penn State University", sponsored by Penn State University.