A machine-learning model for predicting sepsis based on an optimized assay for microbial cell-free DNA sequencing

Clin Chim Acta. 2024 Jun 1:559:119716. doi: 10.1016/j.cca.2024.119716. Epub 2024 May 4.

Abstract

Objective: To integrate an enhanced molecular diagnostic technique to develop and validate a machine-learning model for diagnosing sepsis.

Methods: We prospectively enrolled patients with suspected sepsis from August 2021 to August 2023. Several feature selection algorithms and machine-learning models were used to develop the model. The best classifier was selected by 5-fold cross-validation and was then applied to the testing set to assess the model's performance. Additionally, we used the Shapley Additive exPlanations (SHAP) method to illustrate the effects of the features.
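The selection step described in Methods can be sketched as follows. This is a minimal, standard-library-only illustration of choosing a classifier by 5-fold cross-validation, not the study's actual pipeline: the abstract does not name its software, and every function name here (`k_fold_indices`, `cv_accuracy`, `select_best`) and both toy classifiers are hypothetical stand-ins for the real candidate models.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cv_accuracy(fit, predict, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds for one candidate model."""
    fold_scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = [int(predict(model, X[i]) == y[i]) for i in held_out]
        fold_scores.append(mean(hits))
    return mean(fold_scores)


def select_best(candidates, X, y, k=5, seed=0):
    """candidates maps a name to a (fit, predict) pair; return the top scorer."""
    scores = {name: cv_accuracy(fit, predict, X, y, k, seed)
              for name, (fit, predict) in candidates.items()}
    return max(scores, key=scores.get), scores


# Toy candidates (assumptions for illustration only).
def fit_majority(X, y):
    return Counter(y).most_common(1)[0][0]


def predict_majority(model, x):
    return model


def fit_threshold(X, y):
    m0 = mean(x[0] for x, label in zip(X, y) if label == 0)
    m1 = mean(x[0] for x, label in zip(X, y) if label == 1)
    return (m0 + m1) / 2  # midpoint between the two class means


def predict_threshold(model, x):
    return int(x[0] > model)


# Synthetic one-feature data, not patient data.
X = [[v] for v in range(20)]
y = [0] * 10 + [1] * 10
best, scores = select_best(
    {"majority": (fit_majority, predict_majority),
     "threshold": (fit_threshold, predict_threshold)},
    X, y,
)
```

Every sample is held out exactly once, so each candidate's score reflects only data it was not fitted on. The winning candidate would then be refitted on the full training set before being evaluated on the testing set.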

Results: We established an optimized metagenomic next-generation sequencing (mNGS) assay and proposed using the copies of microbe-specific cell-free DNA per milliliter of plasma (CPM) as the detection signal for evaluating the true burden; the assay showed strong precision and high accuracy. In total, 237 eligible patients were randomly assigned to either the training set (70%, n = 165) or the testing set (30%, n = 72). The random forest classifier achieved accuracy, AUC, and F1 scores of 0.830, 0.918, and 0.856, respectively, outperforming the other machine-learning models in the training set. Our model demonstrated clinical interpretability and achieved good prediction performance in differentiating between bacterial sepsis and non-sepsis, with an AUC of 0.85 and an average precision of 0.91 in the testing set. Based on SHAP values, the top nine features of the model were PCT, CPM, CRP, ALB, SBPmin, RRmax, CREA, PLT, and HRmax.
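For readers unfamiliar with the two testing-set metrics reported above, the sketch below shows one common way to compute them, along with a split-size helper. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the average-precision routine assumes no tied scores, and rounding the testing share up is simply one convention that happens to reproduce the reported 165/72 split.

```python
import math


def train_test_sizes(n_samples, test_fraction):
    """Split sizes when the test share is rounded up (an assumed convention)."""
    n_test = math.ceil(n_samples * test_fraction)
    return n_samples - n_test, n_test


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)


# Tiny synthetic example, not study data.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
ap = average_precision(y_true, y_score)
sizes = train_test_sizes(237, 0.3)
```

AUC depends only on how positives rank relative to negatives, whereas average precision is sensitive to class balance, which is one reason the two values can differ on the same testing set.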

Conclusion: We demonstrated the potential of machine-learning approaches for accurately predicting bacterial sepsis based on an optimized microbial cell-free DNA (mcfDNA) sequencing assay.

Keywords: Diagnosis; Machine-learning model; Microbial cell-free DNA sequencing; Sepsis.

MeSH terms

  • Aged
  • Cell-Free Nucleic Acids* / blood
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Prospective Studies
  • Sepsis* / diagnosis
  • Sepsis* / microbiology
  • Sequence Analysis, DNA

Substances

  • Cell-Free Nucleic Acids