Lung cancer survival period prediction and understanding: Deep learning approaches

Int J Med Inform. 2021 Apr:148:104371. doi: 10.1016/j.ijmedinf.2020.104371. Epub 2020 Dec 29.

Abstract

Introduction: Survival period prediction through early diagnosis of cancer has many benefits. It allows both patients and caregivers to plan the resources, time and intensity of care needed to provide the best possible treatment path for the patients. In this paper, focusing on lung cancer patients, we build several survival prediction models using deep learning techniques to tackle both the classification and the regression formulations of cancer survival prediction. We also conduct a feature importance analysis to understand how factors relevant to lung cancer patients affect their survival periods. We contribute to identifying an approach to estimating survivability that is commonly and practically appropriate for medical use.

Methodologies: We compared the performance of three of the most popular deep learning architectures - Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN) - and also compared the performance of the deep learning models against traditional machine learning models. The data were obtained from the lung cancer section of the Surveillance, Epidemiology, and End Results (SEER) cancer registry.

Results: The deep learning models outperformed traditional machine learning models in both the classification and regression approaches. For classification, with patients' survival periods segmented into the classes '<=6 months', '0.5 - 2 years' and '>2 years', the best deep learning model reached 71.18 % accuracy. For regression, the deep learning models achieved a Root Mean Squared Error (RMSE) of 13.5 % and an R2 value of 0.5. The traditional machine learning models saturated at 61.12 % classification accuracy and 14.87 % RMSE in regression.
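
As an illustration of the two problem formulations reported above, the following minimal Python sketch maps a survival period to the three classes and computes RMSE and R2 from their standard definitions. It is not the authors' code. The function names, the use of months as the unit, and the treatment of the boundaries (exactly 6 months falls in '<=6 months', exactly 24 months falls in '0.5 - 2 years') are assumptions made for illustration. The abstract does not say how RMSE was expressed as a percentage, so the sketch returns RMSE in the units of the target.

```python
import math


def survival_class(months):
    """Map a survival period in months to one of the three classes.

    Boundary handling is an assumption: 6 months -> '<=6 months',
    24 months -> '0.5 - 2 years'.
    """
    if months <= 6:
        return "<=6 months"
    if months <= 24:
        return "0.5 - 2 years"
    return ">2 years"


def rmse(y_true, y_pred):
    """Root Mean Squared Error, in the same units as the target."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical survival periods in months, for demonstration only.
example_months = [3, 6, 7, 24, 25, 60]
example_classes = [survival_class(m) for m in example_months]
```

An R2 of 0.5, as reported for the deep learning regression models, means that the predictions account for about half of the variance in the observed survival periods.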

Conclusions: This approach can serve as a baseline for early prediction, and its predictions can be further improved as more temporal treatment information is collected from treated patients. In addition, we evaluated feature importance to investigate model interpretability, gaining further insight into the survival analysis models and the factors that are important in cancer survival period prediction.
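
The abstract does not state which feature importance method was used. As one common, model-agnostic option, the sketch below shows permutation importance: each feature column is shuffled in turn, and the resulting drop in accuracy is taken as that feature's importance. This is a swapped-in illustration, not the authors' method. The function names, the toy predictor and the toy data are all hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the label."""
    correct = sum(predict(row) == label for row, label in zip(rows, labels))
    return correct / len(rows)


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return the accuracy drop caused by shuffling each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for j in range(n_features):
        # Shuffle column j only, leaving every other feature untouched.
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + [value] + row[j + 1:]
                    for row, value in zip(rows, column)]
        drops.append(baseline - accuracy(predict, shuffled, labels))
    return drops


# Hypothetical toy setup: the predictor looks only at feature 0,
# so shuffling feature 1 cannot change its accuracy.
toy_rows = [[1, 5], [-1, 5], [2, 9], [-2, 9], [3, 1], [-3, 1]]
toy_labels = [row[0] > 0 for row in toy_rows]


def toy_predict(row):
    return row[0] > 0


toy_drops = permutation_importance(toy_predict, toy_rows, toy_labels, n_features=2)
```

In this toy setup the second feature receives an importance of exactly zero because the predictor never reads it. With a trained survival model, a large accuracy drop for a feature would suggest that the model relies on it.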

Keywords: Deep learning; Feature importance; Lung cancer; SEER cancer registry; Survival period prediction.

MeSH terms

  • Deep Learning*
  • Humans
  • Lung Neoplasms*
  • Machine Learning
  • Neural Networks, Computer
  • Registries