Using machine learning to construct nomograms for patients with metastatic colon cancer

Colorectal Dis. 2020 Aug;22(8):914-922. doi: 10.1111/codi.14991. Epub 2020 Feb 16.

Abstract

Aim: Patients with synchronous colon cancer metastases have highly variable overall survival (OS), making accurate predictive models challenging to build. We aim to use machine learning to more accurately predict OS in these patients and to present this predictive model in the form of nomograms for patients and clinicians.

Methods: Using the National Cancer Database (2010-2014), we identified right colon (RC) and left colon (LC) cancer patients with synchronous metastases. Each primary site was split into training and testing datasets. Nomograms predicting 3-year OS were created for each site using Cox proportional hazard regression with lasso regression. Each model was evaluated by both calibration (comparison of predicted vs observed OS) and validation (degree of concordance as measured by the c-index) methodologies.
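The concordance measure named in the Methods can be illustrated with a minimal sketch of Harrell's c-index for right-censored data. This is not the authors' code: the function name `harrell_c_index`, its arguments, and the choice to skip pairs with tied follow-up times are assumptions made for illustration, and the abstract does not state which c-index variant was used at the 3-year horizon.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output where a higher value means higher predicted risk
    (All names are hypothetical; they do not come from the paper.)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # simplification: pairs with tied times are skipped
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.5, 0.1]
c_index = harrell_c_index(times, events, risk)
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking of patients by risk.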

Results: A total of 11 018 RC and 8346 LC patients were used to construct and validate the nomograms. After stratifying each model into five risk groups, the predicted OS was within the 95% CI of the observed OS in four out of five risk groups for both the RC and LC models. Externally validated c-indexes at 3 years for the RC and LC models were 0.794 and 0.761, respectively.
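The calibration check described in the Results (predicted OS compared with the 95% CI of observed OS within five risk groups) could be sketched roughly as follows. The assumptions here are not taken from the paper: observed OS is estimated with a Kaplan-Meier estimator, the interval is a plain Greenwood 95% CI, groups are equal-sized quantiles of predicted survival, and the names `kaplan_meier_at` and `calibration_by_risk_group` are hypothetical.

```python
import math


def kaplan_meier_at(times, events, horizon):
    """Kaplan-Meier survival at `horizon` with a plain Greenwood 95% CI."""
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    surv = 1.0
    var_sum = 0.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1 - deaths / at_risk
        if at_risk > deaths:
            # Greenwood term; skipped when everyone at risk dies (surv is 0)
            var_sum += deaths / (at_risk * (at_risk - deaths))
    se = surv * math.sqrt(var_sum)
    return surv, max(0.0, surv - 1.96 * se), min(1.0, surv + 1.96 * se)


def calibration_by_risk_group(times, events, predicted_surv, horizon, n_groups=5):
    """Compare mean predicted survival with observed KM survival per risk group.

    Requires at least `n_groups` patients so that no group is empty.
    """
    # Sort patients from lowest to highest predicted survival, then cut into
    # n_groups roughly equal groups (an assumed grouping rule).
    order = sorted(range(len(times)), key=lambda i: predicted_surv[i])
    rows = []
    for g in range(n_groups):
        idx = order[g * len(order) // n_groups:(g + 1) * len(order) // n_groups]
        mean_pred = sum(predicted_surv[i] for i in idx) / len(idx)
        obs, lo, hi = kaplan_meier_at(
            [times[i] for i in idx], [events[i] for i in idx], horizon
        )
        rows.append({
            "group": g + 1,
            "predicted": mean_pred,
            "observed": obs,
            "ci": (lo, hi),
            "within_ci": lo <= mean_pred <= hi,
        })
    return rows
```

Each returned row reports whether the mean predicted survival for that group falls inside the interval around the observed estimate, which is the kind of per-group comparison the Results summarize as "four out of five risk groups".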

Conclusions: Utilization of machine learning can result in more accurate predictive models for patients with metastatic colon cancer. Nomograms built from these models can assist clinicians and patients in the shared decision-making process of their cancer care.

Keywords: NCDB; colon cancer; machine learning; metastasis; nomogram.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Colonic Neoplasms*
  • Humans
  • Machine Learning
  • Nomograms
  • Prognosis
  • Rectal Neoplasms*