Gadoxetic acid-enhanced MRI for identifying cholangiocyte phenotype hepatocellular carcinoma by interpretable machine learning: individual application of SHAP

BMC Cancer. 2025 Apr 28;25(1):788. doi: 10.1186/s12885-025-14147-3.

Abstract

Purpose: Cholangiocyte phenotype hepatocellular carcinoma (HCC) is highly invasive. This study aimed to develop and validate an optimal machine learning model for predicting cholangiocyte phenotype HCC based on T1 mapping gadoxetic acid-enhanced MRI, and to apply it to individual patients via SHapley Additive exPlanations (SHAP).

Methods: We included 180 patients with histologically confirmed HCC from two institutions. Clinical and MRI features were screened as predictors of cholangiocyte phenotype HCC using least absolute shrinkage and selection operator (LASSO) and logistic regression analysis. Five machine learning models were constructed based on the selected features. Kaplan-Meier survival analysis was used to compare prognosis between the cholangiocyte phenotype-positive HCC group and the classical (cholangiocyte phenotype-negative) HCC group, and to explore the prognostic information provided by the optimal model.
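
As an illustration of the Kaplan-Meier step described above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' code: the function name `kaplan_meier` and all follow-up times and event flags are hypothetical and chosen only to show how censored and relapsed patients enter the estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient (any consistent unit)
    events : 1 if relapse was observed at that time, 0 if censored
    """
    relapses = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = relapses.get(t, 0)
        if d > 0:
            # Product-limit update: multiply by the fraction that did not relapse at t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # remove both relapsed and censored patients
    return curve


# Hypothetical follow-up data in months; NOT taken from the study.
positive_times = [3, 5, 5, 8, 12, 14]
positive_events = [1, 1, 0, 1, 0, 1]
classical_times = [6, 9, 13, 15, 18, 20]
classical_events = [1, 0, 1, 0, 0, 0]

km_positive = kaplan_meier(positive_times, positive_events)
km_classical = kaplan_meier(classical_times, classical_events)

for label, curve in (("positive", km_positive), ("classical", km_classical)):
    for t, s in curve:
        print(f"{label:9s} t={t:2d}  RFS={s:.3f}")
```

Comparing the two curves formally would still require a test such as the log-rank test or a Cox model, which this sketch does not implement.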

Results: The most significant clinicoradiological features, including the platelet-to-lymphocyte ratio (PLR), tumor capsule, target sign on hepatobiliary phase (HBP), and T1 relaxation time at 20 min (T1rt-20 min), were selected to construct the prediction model. The eXtreme Gradient Boosting (XGBoost) model was selected as the optimal predictive model, achieving AUCs of 0.835, 0.830, 0.816 and 0.776 in the training, internal validation, external validation, and prospective validation cohorts, respectively. This model was then visualized with SHAP, in which T1rt-20 min made a significant contribution. Survival analysis showed a statistically significant difference in relapse-free survival (RFS) between the cholangiocyte phenotype-positive HCC group and the classical HCC group from institution I (hazard ratio [HR], 1.994; 95% CI, 1.059-3.758; P = 0.027), and the constructed XGBoost model could also stratify patients by RFS (HR, 1.986; 95% CI, 1.061-3.717; P = 0.029).
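
To make the per-patient SHAP attribution concrete, the sketch below computes exact Shapley values for a toy four-feature model by enumerating feature subsets. Everything in it is an assumption for illustration: `toy_model` and its coefficients are hypothetical and are not the study's XGBoost model, the patient and baseline values are invented, and "absent" features are simply replaced by a single baseline value, which is a simplification of what SHAP tree explainers do with a background dataset.

```python
from itertools import combinations
from math import factorial

FEATURES = ["PLR", "tumor_capsule", "target_sign_HBP", "T1rt_20min"]


def toy_model(x):
    """Hypothetical log-odds score with one interaction term (illustration only)."""
    return (
        0.004 * x["PLR"]
        - 0.8 * x["tumor_capsule"]
        + 1.1 * x["target_sign_HBP"]
        + 0.003 * x["T1rt_20min"]
        + 0.5 * x["target_sign_HBP"] * (1 - x["tumor_capsule"])
        - 3.0
    )


def shapley_values(model, patient, baseline):
    """Exact Shapley values of each feature for one patient versus a baseline."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # size of the coalition that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Features in the coalition take patient values, the rest baseline values
                without_f = {g: patient[g] if g in subset else baseline[g] for g in FEATURES}
                with_f = dict(without_f)
                with_f[f] = patient[f]
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi


# Hypothetical values; NOT taken from the study.
baseline = {"PLR": 120, "tumor_capsule": 1, "target_sign_HBP": 0, "T1rt_20min": 500}
patient = {"PLR": 180, "tumor_capsule": 0, "target_sign_HBP": 1, "T1rt_20min": 700}

phi = shapley_values(toy_model, patient, baseline)
for name, value in phi.items():
    print(f"{name:16s} {value:+.3f}")
print("sum of contributions:", round(sum(phi.values()), 6))
print("model(patient) - model(baseline):", round(toy_model(patient) - toy_model(baseline), 6))
```

The last two printed values agree, which is the additivity property that lets a SHAP plot decompose one patient's prediction into per-feature contributions.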

Conclusion: The machine learning model based on T1 mapping gadoxetic acid-enhanced MRI shows significant potential for identifying cholangiocyte phenotype HCC. Furthermore, SHAP enables personalized prediction, providing valuable insights to support clinical decision-making.

Keywords: Cholangiocyte phenotype; Hepatocellular carcinoma; Machine learning; Magnetic resonance imaging.

MeSH terms

  • Adult
  • Aged
  • Carcinoma, Hepatocellular* / diagnostic imaging
  • Carcinoma, Hepatocellular* / mortality
  • Carcinoma, Hepatocellular* / pathology
  • Contrast Media
  • Female
  • Gadolinium DTPA* / administration & dosage
  • Humans
  • Kaplan-Meier Estimate
  • Liver Neoplasms* / diagnostic imaging
  • Liver Neoplasms* / mortality
  • Liver Neoplasms* / pathology
  • Machine Learning*
  • Magnetic Resonance Imaging* / methods
  • Male
  • Middle Aged
  • Phenotype
  • Prognosis
  • Retrospective Studies

Substances

  • gadolinium ethoxybenzyl DTPA
  • Gadolinium DTPA
  • Contrast Media