The early diagnosis of hepatocellular carcinoma (HCC) is hampered by a lack of biomarkers with high sensitivity and specificity. This study aimed to identify sensitive and specific diagnostic markers for HCC by analyzing the imbalance of serum metabolites in patients with HCC. Nontargeted and targeted metabolomic analyses were used to explore dysregulated metabolites, and many bile acids (such as DCA, GUDCA, GCDCA, GCA, TCDCA, TDCA, TCA, LCA, and TUDCA) and steroid hormones (such as DHEAS, DHEA, Aldo, Cortisone, and 18-OHF) were found to be dysregulated in HCC. A machine-learning model based on bile acids and steroid hormones was constructed using XGBoost to distinguish patients with HCC from normal controls (NC), patients with chronic hepatitis B (CHB), and patients with metabolic dysfunction-associated steatotic liver disease (MASLD). The levels of bile acids and steroid hormones in patients with HCC were disturbed compared with those in NC and in patients with CHB and MASLD. The XGBoost model showed strong diagnostic ability in the internal test subset of the training cohort (AUC = 0.876) and was verified in an independent test cohort (AUC = 0.813). It exhibited good diagnostic performance in the detection of early-stage and small-size HCC (AUC = 0.896 and AUC = 0.830, respectively) and performed better than the classical biomarker alpha-fetoprotein (AFP). In conclusion, our study established a novel XGBoost model based on bile acids and steroid hormones, which might be helpful for the early diagnosis of HCC.
Keywords: bile acid; hepatocellular carcinoma (HCC); machine learning; metabolomic analysis; steroid hormone.
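For readers less familiar with the AUC values quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It uses the rank-based definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half). The function name `roc_auc` and the toy labels and scores are assumptions made for illustration only; they are synthetic, are not data from this study, and do not reproduce the authors' XGBoost pipeline.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5.

    labels: 1 for the positive class (e.g. HCC), 0 for the negative class.
    scores: model outputs, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Synthetic toy example (hypothetical values, NOT study data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.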