Prognostic Assessment of COVID-19 in ICU by Machine Learning Methods: A Retrospective Study

J Med Internet Res. 2020 Oct 8. doi: 10.2196/23128. Online ahead of print.


Background: Patients with coronavirus disease (COVID-19) in the intensive care unit (ICU) have a high mortality rate, so assessing prognosis early and delivering precise treatment is of great importance.

Objective: To use machine learning to construct a model for the analysis of risk factors and prediction of death among ICU patients with COVID-19.

Methods: In this retrospective study, 123 COVID-19 patients in the ICU of Vulcan Hill Hospital were selected from the database, and the data were randomly divided into a training data set (n = 98) and a test data set (n = 25) at a 4:1 ratio. Significance tests, correlation analysis, and factor analysis were used to screen 100 potential risk factors individually. Conventional logistic regression and four machine learning algorithms were used to construct the risk prediction model for the prognosis of COVID-19 patients in the ICU. The performance of these models was measured by the area under the receiver operating characteristic curve (AUC). Model interpretation and evaluation of the risk prediction model, including the calibration curve, SHAP, and LIME, were performed to assess its stability and reliability. The outcome was ICU death as recorded in the database.
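As an illustration of the two quantitative steps in the Methods, the following is a minimal sketch in plain Python of a random 4:1 split and a rank-based AUC. The function names `split_4_to_1` and `auc`, the seed, and the toy labels and scores are assumptions made for this example; the study's actual code and data are not given in the abstract.

```python
import random


def split_4_to_1(n_patients, seed=0):
    """Shuffle patient indices and split them into training and test sets at 4:1."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, illustrative choice
    n_test = round(n_patients / 5)        # one fifth of the cohort goes to the test set
    return indices[n_test:], indices[:n_test]


def auc(labels, scores):
    """AUC as the probability that a randomly chosen death (label 1) receives a
    higher predicted risk than a randomly chosen survivor (label 0); ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# With 123 patients, the split reproduces the set sizes reported in the abstract.
train_idx, test_idx = split_4_to_1(123)

# Toy labels and scores (not study data) to exercise the AUC function.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With 123 patients this split yields 98 training and 25 test indices, matching the reported set sizes. The pairwise AUC is quadratic in the number of patients, which is acceptable for a cohort of this size.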

Results: Layer-by-layer screening of the 100 potential risk factors revealed 8 important risk factors that were included in the risk prediction model: lymphocyte percentage (LYM%), prothrombin time (PT), lactate dehydrogenase (LDH), total bilirubin (T-Bil), eosinophil percentage (EOS%), creatinine (Cr), neutrophil percentage (NEUT%), and albumin (ALB) level. The eXtreme Gradient Boosting (XGBoost) model built on these 8 risk factors showed the best discrimination, both in 5-fold cross-validation on the training set (AUC=0.86) and in the validation cohort (AUC=0.92). The calibration curve showed that the risk predicted by the model was in good agreement with the actual risk. In addition, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) were used to interpret the XGBoost black-box model at the level of features and of individual predictions. The model has also been translated into an online risk calculator that is freely available for public use (
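To make the calibration statement concrete, the sketch below shows one common way to compute the points of a calibration curve: predicted risks are grouped into equal-width bins, and the mean predicted risk in each bin is compared with the observed death rate. The function name `calibration_curve`, the number of bins, and the toy values are assumptions for illustration only; the abstract does not state how the authors computed their curve.

```python
def calibration_curve(labels, probs, n_bins=5):
    """Return (mean predicted risk, observed event rate, count) for each non-empty bin.

    labels: 0/1 outcomes (1 = death); probs: predicted risks in [0, 1].
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        # Equal-width bins over [0, 1]; a risk of exactly 1.0 falls in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))

    curve = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            curve.append((mean_pred, observed, len(members)))
    return curve


# Toy values (not study data): a well-calibrated model has points near the diagonal,
# i.e. mean predicted risk close to the observed event rate in every bin.
toy_curve = calibration_curve([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9], n_bins=2)
```

Good agreement between predicted and actual risk corresponds to each point lying close to the line where mean predicted risk equals the observed rate.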

Conclusions: The 8-factor XGBoost model predicts the risk of death in ICU patients with COVID-19 well. It shows preliminary evidence of stability and can be used effectively to predict COVID-19 prognosis in ICU patients.