Mining whole-lung information by artificial intelligence for predicting EGFR genotype and targeted therapy response in lung cancer: a multicohort study

Lancet Digit Health. 2022 May;4(5):e309-e319. doi: 10.1016/S2589-7500(22)00024-3. Epub 2022 Mar 24.


Background: Epidermal growth factor receptor (EGFR) genotype is crucial for treatment decision making in lung cancer, but it can be affected by tumour heterogeneity and invasive biopsy during gene sequencing. Importantly, not all patients with an EGFR mutation have a good prognosis with EGFR-tyrosine kinase inhibitors (TKIs), indicating the necessity of stratifying patients with an EGFR-mutant genotype. In this study, we proposed a fully automated artificial intelligence system (FAIS) that mines whole-lung information from CT images to predict EGFR genotype and prognosis with EGFR-TKI treatment.

Methods: We included 18 232 patients with lung cancer with CT imaging and EGFR gene sequencing from nine cohorts in China and the USA, including a prospective cohort in an Asian population (n=891) and The Cancer Imaging Archive cohort in a White population. These cohorts were divided into a thick CT group and a thin CT group. The FAIS was built for predicting EGFR genotype and progression-free survival of patients receiving EGFR-TKIs, and it was evaluated by area under the curve (AUC) and Kaplan-Meier analysis. We further built two tumour-based deep learning models for comparison with the FAIS, and we explored the value of combining FAIS and clinical factors (the FAIS-C model). Additionally, we included 891 patients with 56-panel next-generation sequencing and 87 patients with RNA sequencing data to explore the biological mechanisms of FAIS.
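The AUC used for evaluation can be read as the probability that a randomly chosen EGFR-mutant case receives a higher predicted score than a randomly chosen wild-type case. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auc_score` and the toy labels and scores are assumptions made up for illustration and are not data from the study.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 = EGFR-mutant, 0 = wild-type (illustrative coding).
    scores: predicted probability of mutation. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not study data).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2]
toy_auc = auc_score(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

The quadratic pairwise loop is chosen for readability; for cohorts of the size reported here, a rank-based computation or an established library routine would be the practical choice.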

Findings: FAIS achieved AUCs ranging from 0·748 to 0·813 in the six retrospective and prospective testing cohorts, outperforming the commonly used tumour-based deep learning model. Genotype predicted by the FAIS-C model was significantly associated with prognosis with EGFR-TKI treatment (log-rank p<0·05), making it an important complement to gene sequencing. Moreover, we found 29 prognostic deep learning features in FAIS that were able to identify patients with an EGFR mutation at high risk of TKI resistance. These features showed strong associations with multiple genotypes (p<0·05, t test or Wilcoxon test) and gene pathways linked to drug resistance and cancer progression mechanisms.
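The prognosis comparison reported above rests on progression-free survival curves for the predicted genotype groups. As a minimal sketch of how one such curve can be estimated with the Kaplan-Meier product-limit method, assuming follow-up times and a progression/censoring indicator per patient, consider the following. The function name `kaplan_meier` and the toy values are hypothetical and are not taken from the study; the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times:  follow-up time per patient (e.g. months).
    events: 1 = progression observed, 0 = censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        progressed = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            progressed += data[i][1]
            removed += 1
            i += 1
        if progressed > 0:
            surv *= 1.0 - progressed / at_risk
            curve.append((t, surv))
        # Both progressed and censored patients leave the risk set.
        at_risk -= removed
    return curve


# Toy example (made-up values, not study data).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for t, s in toy_curve:
    print(t, round(s, 3))
```

Running the same estimator separately on each predicted group gives the curves that a log-rank test would then compare.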

Interpretation: FAIS provides a non-invasive method to detect EGFR genotype and identify patients with an EGFR mutation at high risk of TKI resistance. The superior performance of FAIS over tumour-based deep learning methods suggests that genotype and prognostic information could be obtained from the whole lung instead of only tumour tissues.

Funding: National Natural Science Foundation of China.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • Carcinoma, Non-Small-Cell Lung* / drug therapy
  • Carcinoma, Non-Small-Cell Lung* / genetics
  • Carcinoma, Non-Small-Cell Lung* / pathology
  • ErbB Receptors / genetics
  • ErbB Receptors / therapeutic use
  • Genes, erbB-1
  • Genotype
  • Humans
  • Lung / pathology
  • Lung Neoplasms* / drug therapy
  • Lung Neoplasms* / genetics
  • Lung Neoplasms* / pathology
  • Mutation
  • Prospective Studies
  • Protein Kinase Inhibitors / pharmacology
  • Protein Kinase Inhibitors / therapeutic use
  • Retrospective Studies

Substances

  • Protein Kinase Inhibitors
  • EGFR protein, human
  • ErbB Receptors