Establishment and Validation of a Machine Learning Prediction Model Based on Big Data for Predicting the Risk of Bone Metastasis in Renal Cell Carcinoma Patients

Comput Math Methods Med. 2022 Oct 3;2022:5676570. doi: 10.1155/2022/5676570. eCollection 2022.

Abstract

Purpose: Because the prognosis of renal cell carcinoma (RCC) patients with bone metastasis (BM) is poor, this study aimed to use big data to build a machine learning (ML) model that predicts the risk of BM in RCC patients.

Methods: A retrospective study was conducted on 40,355 RCC patients in the SEER database from 2010 to 2017. LASSO regression and multivariate logistic regression analyses were performed to identify independent risk factors for RCC-BM. Six ML algorithms (LR, GBM, XGB, RF, DT, and NBC) were used to build risk models for predicting RCC-BM. The predictive performance of the ML models was evaluated by 10-fold cross-validation.
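The workflow described in Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, runs on synthetic data rather than SEER records, uses an L1-penalised logistic regression as a LASSO-style feature filter, and lets scikit-learn's GradientBoostingClassifier stand in for both GBM and XGB so that no extra package is needed (so five models are compared instead of six). All variable names and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cohort (NOT SEER data): about 5% positive cases.
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=6,
    weights=[0.95, 0.05], random_state=0,
)

# Step 1: LASSO-style selection. An L1 penalty shrinks some coefficients
# to exactly zero; features with non-zero coefficients are kept.
lasso = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.05),
)
lasso.fit(X, y)
coef = lasso.named_steps["logisticregression"].coef_.ravel()
selected = np.flatnonzero(coef)
if selected.size == 0:  # fall back to all features if everything was zeroed
    selected = np.arange(X.shape[1])
X_sel = X[:, selected]

# Step 2: candidate models (hyperparameters are illustrative assumptions).
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "GBM": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "DT": DecisionTreeClassifier(max_depth=4, random_state=0),
    "NBC": GaussianNB(),
}

# Step 3: stratified 10-fold cross-validation, scored by ROC AUC.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
auc = {
    name: cross_val_score(model, X_sel, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, value in sorted(auc.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean AUC = {value:.3f}")
```

For brevity the feature filter here is fitted on the full data set before cross-validation, which can bias the AUC estimates upward; a stricter version would place the selection step inside each fold.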

Results: The study included 40,355 patients diagnosed with RCC in the SEER database, of whom 1,811 (4.5%) had BM. Independent risk factors for BM were tumor grade, T stage, N stage, liver metastasis, lung metastasis, and brain metastasis. Among the RCC-BM risk prediction models built with the six ML algorithms, the XGB model showed the best predictive performance (AUC = 0.891). A web-based calculator built on the XGB model was therefore established to assess the risk of BM in individual patients with RCC.
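An individual risk calculator of the kind mentioned in Results could, in outline, wrap a fitted classifier in a function that maps one patient's risk factors to a predicted probability. The sketch below is only illustrative: the training data are randomly generated, the outcome coefficients are invented, the integer encodings of grade and stage are hypothetical, and scikit-learn's GradientBoostingClassifier is used in place of the XGB model. The function name `predict_bm_risk` is likewise an assumption.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# The six risk factors named in the abstract; the encodings are hypothetical.
FEATURES = ["grade", "t_stage", "n_stage", "liver_met", "lung_met", "brain_met"]

rng = np.random.default_rng(0)
n = 3000
X = np.column_stack([
    rng.integers(1, 5, n),  # tumor grade 1-4 (assumed encoding)
    rng.integers(1, 5, n),  # T stage 1-4 (assumed encoding)
    rng.integers(0, 2, n),  # N stage 0/1
    rng.integers(0, 2, n),  # liver metastasis 0/1
    rng.integers(0, 2, n),  # lung metastasis 0/1
    rng.integers(0, 2, n),  # brain metastasis 0/1
])

# Invented outcome model, used only to generate synthetic labels.
logit = (-5.0 + 0.3 * X[:, 0] + 0.3 * X[:, 1] + 0.8 * X[:, 2]
         + 1.0 * X[:, 3] + 1.2 * X[:, 4] + 1.0 * X[:, 5])
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

model = GradientBoostingClassifier(random_state=0).fit(X, y)


def predict_bm_risk(patient: dict) -> float:
    """Return the model's predicted BM probability for one patient."""
    row = np.array([[patient[name] for name in FEATURES]], dtype=float)
    return float(model.predict_proba(row)[0, 1])


low = predict_bm_risk({"grade": 1, "t_stage": 1, "n_stage": 0,
                       "liver_met": 0, "lung_met": 0, "brain_met": 0})
high = predict_bm_risk({"grade": 4, "t_stage": 4, "n_stage": 1,
                        "liver_met": 1, "lung_met": 1, "brain_met": 1})
print(f"low-risk profile: {low:.3f}, high-risk profile: {high:.3f}")
```

The probabilities printed here reflect only the synthetic data and carry no clinical meaning.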

Conclusion: The XGB-based risk prediction model showed good predictive performance for BM in RCC patients.

MeSH terms

  • Big Data
  • Bone Neoplasms*
  • Carcinoma, Renal Cell* / pathology
  • Humans
  • Kidney Neoplasms* / pathology
  • Machine Learning
  • Neoplasm Metastasis* / pathology
  • Retrospective Studies
  • Risk Factors