Predictive mutation signature of immunotherapy benefits in NSCLC based on machine learning algorithms

Front Immunol. 2022 Sep 27;13:989275. doi: 10.3389/fimmu.2022.989275. eCollection 2022.


Background: Developing tools that predict the benefit of immunotherapy is a clinically important and rapidly emerging field. The biomarker in routine use is imprecise and may not make full use of the large amounts of medical data available. Machine learning is a promising way to predict immunotherapy benefit from individual patient data by identifying the most important features among genomic data and clinical characteristics.

Methods: Machine learning was applied to identify a list of candidate genes that may predict immunotherapy benefit, using data from a published cohort of 853 patients with NSCLC. We used XGBoost to capture nonlinear relations between many mutated genes and the benefit of immune checkpoint inhibitor (ICI) treatment. The value of the derived machine learning-based mutation signature (ML-signature) for immunotherapy efficacy was evaluated and compared with the tumor mutational burden (TMB) and other clinical characteristics. The predictive power of the ML-signature was also evaluated in independent cohorts of patients with NSCLC treated with ICI.

Results: We constructed the ML-signature from 429 patients who received immunotherapy (training/validation split = 8/2) and extracted 88 eligible predictive genes. We then performed internal and external validation using the OAK+POPLAR dataset and independent cohorts, respectively. The ML-signature was enriched in immune-related signaling pathways and, compared with TMB, showed favorable predictive value and patient stratification.

Conclusion: Previous studies reported no predictive difference between the original TMB and modified TMB, and the original TMB includes some genes with no predictive value. To show that testing fewer genes is sufficient to predict immunotherapy efficacy, we used machine learning to screen the gene panels used to calculate TMB. This yielded an 88-gene panel that showed favorable predictive performance and stratification compared with the original TMB.
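
The idea of computing a mutation burden over a reduced gene panel can be sketched as follows. This is a simplified illustration only: it counts mutated genes rather than mutations per megabase, and the patient identifiers, gene names, and panels are hypothetical placeholders, not the 88-gene panel from the study.

```python
from typing import Dict, Optional, Set


def mutation_count(mutated: Set[str], panel: Optional[Set[str]] = None) -> int:
    """Count a patient's mutated genes, optionally restricted to a panel."""
    if panel is None:
        return len(mutated)
    return len(mutated & panel)


# Hypothetical patients and placeholder gene names.
patients: Dict[str, Set[str]] = {
    "P1": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "P2": {"GENE_C"},
    "P3": {"GENE_A", "GENE_D", "GENE_E"},
}

# "Original" burden uses the full panel; "modified" uses a reduced panel
# such as one selected by a model (both panels are made up here).
full_panel = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
reduced_panel = {"GENE_A", "GENE_D"}

original = {pid: mutation_count(g, full_panel) for pid, g in patients.items()}
modified = {pid: mutation_count(g, reduced_panel) for pid, g in patients.items()}

print(original)  # {'P1': 4, 'P2': 1, 'P3': 3}
print(modified)  # {'P1': 2, 'P2': 0, 'P3': 2}
```

In this toy example the reduced panel preserves the ordering of P1 and P3 above P2 while testing fewer genes, which is the kind of behavior the Conclusion argues for.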

Keywords: gene; immunotherapy; machine learning (ML); non-small cell lung cancer (NSCLC); tumor mutational burden (TMB).

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / genetics
  • Carcinoma, Non-Small-Cell Lung* / drug therapy
  • Carcinoma, Non-Small-Cell Lung* / therapy
  • Humans
  • Immunotherapy
  • Lung Neoplasms* / drug therapy
  • Lung Neoplasms* / therapy
  • Machine Learning
  • Mutation