A machine learning tool to improve prediction of mediastinal lymph node metastases in non-small cell lung cancer using routinely obtainable [18F]FDG-PET/CT parameters

Eur J Nucl Med Mol Imaging. 2023 Jun;50(7):2140-2151. doi: 10.1007/s00259-023-06145-z. Epub 2023 Feb 23.

Abstract

Background: In patients with non-small cell lung cancer (NSCLC), accuracy of [18F]FDG-PET/CT for pretherapeutic lymph node (LN) staging is limited by false positive findings. Our aim was to evaluate machine learning with routinely obtainable variables to improve accuracy over standard visual image assessment.

Methods: Monocentric retrospective analysis of pretherapeutic [18F]FDG-PET/CT in 491 consecutive patients with NSCLC using an analog PET/CT scanner (training + test cohort, n = 385) or digital scanner (validation, n = 106). Forty clinical variables, tumor characteristics, and image variables (e.g., primary tumor and LN SUVmax and size) were collected. Different combinations of machine learning methods for feature selection and classification of N0/1 vs. N2/3 disease were compared. Ten-fold nested cross-validation was used to derive the mean area under the ROC curve of the ten test folds ("test AUC") and AUC in the validation cohort. Reference standard was the final N stage from interdisciplinary consensus (histological results for N2/3 LNs in 96%).
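The "test AUC" described above is the mean of the AUCs obtained on the ten held-out test folds. As a minimal sketch of that averaging step only (not the authors' pipeline: feature selection, model fitting, and the inner cross-validation loop are omitted), the AUC can be computed with the rank-based (Mann-Whitney) definition. The function names `roc_auc` and `mean_test_auc` and the toy numbers are illustrative assumptions, not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case (label 1,
    e.g. N2/3) receives a higher score than a randomly chosen negative case
    (label 0, e.g. N0/1). Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_test_auc(folds):
    """Average the per-fold AUCs; `folds` is a list of (labels, scores)
    pairs, one pair per held-out test fold."""
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return sum(aucs) / len(aucs)


# Hypothetical toy data for two folds (the study used ten).
toy_folds = [
    ([0, 1], [0.2, 0.9]),
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
]
print(mean_test_auc(toy_folds))  # 0.875
```

In the second toy fold, three of the four positive/negative pairs are ranked correctly, giving an AUC of 0.75; averaged with the perfectly separated first fold, the mean is 0.875.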

Results: N2/3 disease was present in 190 patients (39%; training + test, 37%; validation, 46%; p = 0.09). A gradient boosting classifier (GBM) with 10 features was selected as the final model based on test AUC of 0.91 (95% confidence interval, 0.87-0.94). Validation AUC was 0.94 (0.89-0.98). At a target sensitivity of approximately 90%, test/validation accuracy of the GBM was 0.78/0.87. This was significantly higher than the accuracy based on "mediastinal LN uptake > mediastinum" (0.70/0.75; each p < 0.05) or combined PET/CT criteria (PET positive and/or LN short axis diameter > 10 mm; 0.68/0.75; each p < 0.001). Harmonization of PET images between the two scanners affected SUVmax and visual assessment of the LNs but did not diminish the AUC of the GBM.
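Reporting accuracy "at a target sensitivity" implies choosing an operating threshold on the model output first and then counting correct classifications at that threshold. The following is a minimal sketch of one way to do this, assuming a case is called positive when its score is at or above the threshold. The abstract does not state how the authors selected their cut-off, so the rule used here (the highest threshold that still reaches the target sensitivity) and the names `threshold_for_sensitivity` and `accuracy_at` are assumptions, and the data are invented for illustration.

```python
def threshold_for_sensitivity(labels, scores, target=0.90):
    """Return the highest threshold t such that calling `score >= t`
    positive detects at least `target` of the true positives (label 1)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    if not pos:
        raise ValueError("need at least one positive case")
    # Candidate thresholds are the positive scores, tried from high to low;
    # sensitivity can only increase as the threshold decreases.
    for t in sorted(set(pos), reverse=True):
        sensitivity = sum(s >= t for s in pos) / len(pos)
        if sensitivity >= target:
            return t
    return min(pos)  # not reached for target <= 1; kept as a safe fallback


def accuracy_at(labels, scores, threshold):
    """Fraction of cases whose thresholded prediction matches the label."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


# Hypothetical scores: ten N2/3 cases (label 1) and ten N0/1 cases (label 0).
labels = [1] * 10 + [0] * 10
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.20,
          0.10, 0.15, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.58, 0.62]

thr = threshold_for_sensitivity(labels, scores, target=0.90)
print(thr, accuracy_at(labels, scores, thr))  # 0.55 0.85
```

In this toy example the threshold 0.55 detects nine of ten positive cases (sensitivity 0.90) and produces two false positives among the ten negative cases, so 17 of 20 cases are classified correctly.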

Conclusions: A machine learning model based on routinely available variables from [18F]FDG-PET/CT improved accuracy in mediastinal LN staging compared to established visual assessment criteria. A web application implementing this model was made available.

Keywords: Artificial intelligence; FDG-PET/CT; Lung cancer; Lymph node staging; Machine learning; NSCLC.

Publication types

  • Comment

MeSH terms

  • Carcinoma, Non-Small-Cell Lung* / diagnostic imaging
  • Carcinoma, Non-Small-Cell Lung* / pathology
  • Fluorodeoxyglucose F18
  • Humans
  • Lung Neoplasms* / diagnostic imaging
  • Lung Neoplasms* / pathology
  • Lymph Nodes / pathology
  • Lymphatic Metastasis / diagnostic imaging
  • Lymphatic Metastasis / pathology
  • Mediastinum / diagnostic imaging
  • Neoplasm Staging
  • Positron Emission Tomography Computed Tomography / methods
  • Retrospective Studies

Substances

  • Fluorodeoxyglucose F18