A deep learning-based framework for lung cancer survival analysis with biomarker interpretation

BMC Bioinformatics. 2020 Mar 18;21(1):112. doi: 10.1186/s12859-020-3431-z.

Abstract

Background: Lung cancer is the leading cause of cancer-related deaths in both men and women in the United States, and it has a much lower five-year survival rate than many other cancers. Accurate survival analysis is urgently needed for better disease diagnosis and treatment management.

Results: In this work, we propose a survival analysis system that takes advantage of recently emerging deep learning techniques. The proposed system consists of three major components. 1) The first component is an end-to-end cellular feature learning module using a deep neural network with global average pooling. The learned cellular representations encode high-level biologically relevant information without requiring individual cell segmentation, and they are aggregated into patient-level feature vectors by a locality-constrained linear coding (LLC)-based bag of words (BoW) encoding algorithm. 2) The second component is a Cox proportional hazards model with an elastic net penalty for robust feature selection and survival analysis. 3) The third component is a biomarker interpretation module that can help localize the image regions that contribute to the survival model's decision. Extensive experiments show that the proposed survival model has excellent predictive power on a public lung cancer dataset (The Cancer Genome Atlas) in terms of two commonly used metrics: the log-rank test (p-value) of the Kaplan-Meier estimate and the concordance index (c-index).
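For readers unfamiliar with the c-index named above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the authors' evaluation code; the function name `concordance_index` and the toy inputs are assumptions introduced here. A pair of patients is treated as comparable when the patient with the shorter follow-up time had an observed event, and tied risk scores count as half-concordant.

```python
import numpy as np


def concordance_index(times, events, risk_scores):
    """Harrell's c-index: share of comparable pairs ranked correctly.

    times       -- follow-up time per patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    (All names are illustrative, not taken from the paper.)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risk = np.asarray(risk_scores, dtype=float)

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair must be anchored on an observed event
        for j in range(n):
            if times[j] > times[i]:  # patient j outlived patient i
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0  # higher risk died earlier
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
print(concordance_index(toy_times, toy_events, [4, 3, 2, 1]))  # perfectly ordered risks give 1.0
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The double loop is quadratic in the number of patients, which is acceptable for a sketch but slow for large cohorts.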

Conclusions: In this work, we have proposed a segmentation-free survival analysis system that takes advantage of the recently emerging deep learning framework and well-studied survival analysis methods such as the Cox proportional hazards model. In addition, we provide an approach to visualize the discovered biomarkers, which can serve as concrete evidence supporting the survival model's decision.
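As a rough illustration of what a Cox proportional hazards model with an elastic net penalty optimizes, the sketch below evaluates the penalized negative log partial likelihood for a given coefficient vector. It is not the authors' implementation. The function name `penalized_cox_loss`, the parameters `lam` and `alpha`, and the glmnet-style form of the penalty are assumptions made here, and ties in event times are handled in the Breslow manner.

```python
import numpy as np


def penalized_cox_loss(beta, X, times, events, lam=0.1, alpha=0.5):
    """Negative Cox log partial likelihood plus an elastic net penalty.

    beta   -- coefficient vector, shape (p,)
    X      -- patient-level feature matrix, shape (n, p)
    times  -- follow-up times, shape (n,)
    events -- 1 if the event was observed, 0 if censored
    lam    -- overall penalty strength (assumed name)
    alpha  -- mix between L1 (alpha=1) and L2 (alpha=0) (assumed name)
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)

    eta = X @ beta  # linear risk score per patient

    # at_risk[i, j] is True when patient j is still at risk at time t_i.
    at_risk = times[None, :] >= times[:, None]
    log_risk_sum = np.log((at_risk * np.exp(eta)[None, :]).sum(axis=1))

    # Only observed events contribute terms to the partial likelihood.
    neg_log_pl = -np.sum((eta - log_risk_sum)[events]) / len(times)

    # Elastic net: L1 term encourages sparsity, L2 term stabilizes
    # the estimate when features are correlated.
    penalty = lam * (alpha * np.abs(beta).sum()
                     + 0.5 * (1.0 - alpha) * np.sum(beta ** 2))
    return neg_log_pl + penalty


# Toy data (hypothetical): one feature, three patients, no censoring.
X_toy = np.array([[1.0], [0.0], [-1.0]])
t_toy = [1.0, 2.0, 3.0]
e_toy = [1, 1, 1]
print(penalized_cox_loss(np.array([0.5]), X_toy, t_toy, e_toy))
```

In practice this loss would be minimized over `beta` by a dedicated solver; coefficients driven to zero by the L1 term correspond to features the model discards. The direct use of `np.exp` is kept for readability and can overflow for very large risk scores.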

Keywords: Cell detection; Deep learning; Feature learning; Survival analysis.

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms
  • Biomarkers / analysis*
  • Deep Learning
  • Female
  • Humans
  • Kaplan-Meier Estimate
  • Lung Neoplasms / genetics
  • Lung Neoplasms / mortality*
  • Male
  • Neural Networks, Computer
  • Proportional Hazards Models
  • Survival Analysis*

Substances

  • Biomarkers