Investigating the relationship between tongue diagnosis features and gastric cancer: A machine learning-based prediction model

Eur J Surg Oncol. 2025 Oct;51(10):110352. doi: 10.1016/j.ejso.2025.110352. Epub 2025 Jul 26.

Abstract

Background and objective: Gastric cancer (GC) remains a major cause of cancer-related mortality, particularly in China, where early detection is hindered by reliance on invasive, resource-intensive methods like gastroscopy. This study aimed to develop a non-invasive diagnostic tool integrating tongue features derived from traditional Chinese medicine (TCM) with machine learning (ML) to enhance early GC detection.

Methods: A prospective, propensity score-matched cohort of 292 participants (146 GC, 146 non-GC) was analyzed. Tongue features (color, morphology, coating) were captured under standardized protocols alongside gastroscopic findings. Seven ML algorithms, including GBDT, LightGBM, and XGBoost, were trained on multimodal clinical and imaging data. Features were selected using LASSO regression, and model performance was evaluated by stratified 5-fold cross-validation and on an independent test set comprising 30 % of the data. Association rule mining (FP-Growth) was used to explore predictive tongue-gastroscopy patterns.
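
The evaluation scheme described in the Methods can be illustrated with a minimal sketch. It assumes, since the abstract does not say so, that the 30 % test set was drawn with class proportions preserved and that the 5-fold cross-validation ran on the remaining 70 %. The helper names `stratified_holdout` and `stratified_kfold` are hypothetical, and the labels are synthetic placeholders that match only the cohort size (146 GC, 146 non-GC), not the study data.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.30, seed=0):
    """Split sample indices into train/test while keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


def stratified_kfold(indices, labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with each class spread evenly over k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal the shuffled samples of this class round-robin into the folds.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, val


# Synthetic labels (assumption): 1 = GC, 0 = non-GC, sized like the cohort.
labels = [1] * 146 + [0] * 146
train_idx, test_idx = stratified_holdout(labels)
folds = list(stratified_kfold(train_idx, labels))
print(len(train_idx), len(test_idx), [len(val) for _, val in folds])
```

Any model, such as a gradient-boosted tree classifier, would then be fitted on each `train` split and scored on the matching `val` split, with `test_idx` left untouched until the final evaluation.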

Results: GC patients more frequently exhibited bluish-purple (42 %), cracked (87 %), swollen (86 %), and prickly tongues (67 %), as well as grayish-black coatings (29 %). Non-GC individuals more often showed pale white tongues (40 %) and peeled coatings (71 %). FP-Growth identified combinations such as a cracked tongue with grayish-black and thin coatings as predictors of hemorrhagic GC (confidence = 88.89 %). LASSO highlighted key predictors including prickly tongue and CA19-9. Among the models, GBDT achieved the best performance (test AUC = 0.980, F1 = 0.932). SHAP analysis confirmed the predictive value of both tongue features and tumor markers.
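
For readers unfamiliar with association rule mining, the confidence of a rule A → B is the fraction of records containing every item in A that also contain B. The sketch below computes it on toy records; the item names and counts are illustrative assumptions, not the study data. A confidence of 88.89 % would arise, for example, if 8 of 9 matching records also showed the outcome, although the abstract does not report the underlying counts.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(A and B) / support(A)."""
    antecedent = set(antecedent)
    consequent = set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        raise ValueError("antecedent never occurs in the transactions")
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


# Toy records (assumption): each set lists the findings observed in one patient.
tongue_pattern = {"cracked_tongue", "grayish_black_coating", "thin_coating"}
records = (
    [tongue_pattern | {"hemorrhagic_gc"} for _ in range(8)]  # pattern and outcome
    + [set(tongue_pattern)]                                   # pattern only
    + [{"pale_white_tongue", "peeled_coating"}]               # unrelated record
)

confidence = rule_confidence(records, tongue_pattern, {"hemorrhagic_gc"})
print(f"confidence = {confidence * 100:.2f} %")  # prints "confidence = 88.89 %"
```

FP-Growth itself is an efficient way of finding the frequent item combinations; the confidence calculation applied to each candidate rule afterwards is the one shown here.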

Conclusion: TCM-based tongue diagnostics combined with ML provides a promising, non-invasive tool for early GC detection.

Keywords: Gastric cancer; Machine learning; Non-invasive diagnostics; Tongue feature; Traditional Chinese medicine (TCM).

MeSH terms

  • Aged
  • Early Detection of Cancer* / methods
  • Female
  • Gastroscopy
  • Humans
  • Machine Learning*
  • Male
  • Medicine, Chinese Traditional
  • Middle Aged
  • Prospective Studies
  • Stomach Neoplasms* / diagnosis
  • Stomach Neoplasms* / pathology
  • Tongue* / diagnostic imaging
  • Tongue* / pathology