Clinical and inflammatory features based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: results from a retrospective cohort study

Ann Med. 2021 Dec;53(1):257-266. doi: 10.1080/07853890.2020.1868564.

Abstract

Objectives: To appraise effective predictors for COVID-19 mortality in a retrospective cohort study.

Methods: A total of 1270 COVID-19 patients were included in this study: 984 admitted to the Sino-French New City Branch (randomly split at a 7:3 ratio into training and internal validation sets) and 286 admitted to the Optical Valley Branch (external validation set) of Wuhan Tongji Hospital. Forty-eight clinical and laboratory features were screened with the LASSO method. A multi-tree extreme gradient boosting (XGBoost) model was then used to rank the importance of the features selected by LASSO, and a death risk prediction model was subsequently constructed with a simple-tree XGBoost model. Model performance was evaluated by AUC, prediction accuracy, precision, and F1 score.
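The 7:3 random split of the 984-patient cohort can be illustrated with a minimal sketch. The function name `split_cohort`, the seed, the use of integer patient IDs, and the rounding rule are all assumptions made for illustration; the abstract does not describe how the split was implemented or the exact size of each set.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient IDs into training and internal validation sets.

    Hypothetical helper: the seed and rounding rule are assumptions,
    not details reported in the abstract.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Placeholder IDs 0..983 stand in for the 984 patients of the first branch.
train_ids, valid_ids = split_cohort(range(984))
print(len(train_ids), len(valid_ids))
```

The 286 patients from the second branch would be kept entirely separate as the external validation set and never passed through this split.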

Results: Six features, including disease severity, age, and levels of high-sensitivity C-reactive protein (hs-CRP), lactate dehydrogenase (LDH), ferritin, and interleukin-10 (IL-10), were selected as predictors of COVID-19 mortality. A simple-tree XGBoost model built on these features predicted death risk accurately, with >90% precision and >85% sensitivity, as well as F1 scores >0.90, in the training and validation sets.
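For readers less familiar with the reported metrics, the following sketch shows how precision, sensitivity, F1 score, and accuracy are derived from binary predictions. The labels below are toy values chosen for illustration, not study data, and `classification_metrics` is a hypothetical helper name; death is coded as 1 and survival as 0 by assumption.

```python
def classification_metrics(y_true, y_pred):
    """Compute precision, sensitivity, F1 and accuracy for binary labels (1 = death)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # deaths correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # survivors predicted as deaths
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # deaths that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # survivors correctly predicted

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Toy labels only (not study data): 4 deaths and 6 survivors.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is the harmonic mean of precision and sensitivity, it always lies between the two and is pulled toward the lower one.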

Conclusion: We propose disease severity, age, and serum levels of hs-CRP, LDH, ferritin, and IL-10 as significant predictors of death risk in COVID-19, which may help to identify high-risk COVID-19 cases.

KEY MESSAGES

  • A machine learning method is used to build a death risk model for COVID-19 patients.
  • Disease severity, age, hs-CRP, LDH, ferritin, and IL-10 are death risk factors.
  • These findings may help to identify high-risk COVID-19 cases.

Keywords: COVID-19; extreme gradient boosting; fatal risk; machine learning.

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • C-Reactive Protein / metabolism
  • COVID-19 / epidemiology
  • COVID-19 / metabolism
  • COVID-19 / mortality*
  • COVID-19 / physiopathology
  • Cardiovascular Diseases / epidemiology
  • China / epidemiology
  • Clinical Decision Rules*
  • Cohort Studies
  • Comorbidity
  • Diabetes Mellitus / epidemiology
  • Female
  • Ferritins / metabolism
  • Hospitalization*
  • Humans
  • Hypertension / epidemiology
  • Interleukin-10 / metabolism
  • L-Lactate Dehydrogenase / metabolism
  • Machine Learning*
  • Male
  • Middle Aged
  • Prognosis
  • Reproducibility of Results
  • Retrospective Studies
  • SARS-CoV-2
  • Severity of Illness Index

Substances

  • Interleukin-10
  • C-Reactive Protein
  • Ferritins
  • L-Lactate Dehydrogenase