Building a predictive model to identify clinical indicators for COVID-19 using machine learning method

Med Biol Eng Comput. 2022 Jun;60(6):1763-1774. doi: 10.1007/s11517-022-02568-2. Epub 2022 Apr 25.

Abstract

Although some studies have tried to identify risk factors for COVID-19, the evidence comparing COVID-19 with community-acquired pneumonia (CAP) is inconclusive, even though CAP is the most common pneumonia and has symptoms similar to those of COVID-19. We conducted a case-control study with 35 routinely collected clinical indicators and demographic factors to identify predictors of COVID-19, with CAP patients as controls. We randomly split the dataset into a training set (70%) and a testing set (30%). We built an Explainable Boosting Machine to select the important factors and then built a decision tree on the selected variables to interpret their relationships. The top five individual predictors of COVID-19 are albumin, total bilirubin, monocyte count, alanine aminotransferase, and monocyte percentage, with importance scores ranging from 0.078 to 0.567. The top systematic predictors of COVID-19 are liver function, monocyte increase, plasma protein, granulocyte, and renal function (importance scores ranging from 0.009 to 0.096). We identified five combinations of important indicators that screen COVID-19 patients from CAP patients, with differentiating abilities ranging from 83.3% to 100%. An online predictive tool for our model was published. Certain clinical indicators that most hospitals collect routinely could help screen for COVID-19 and distinguish it from CAP. While further verification is needed, our findings and predictive tool could help screen suspected COVID-19 cases.
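The workflow described above (a random 70/30 split, a ranking of candidate predictors, and a threshold rule on the top-ranked one) can be illustrated with a minimal sketch. It is not the authors' implementation: the Explainable Boosting Machine is replaced by a much simpler stand-in, a one-level decision stump scored by Gini impurity, and the data are synthetic random numbers with no clinical meaning. The names `split_70_30`, `best_threshold`, `informative`, and `noise` are hypothetical and chosen only for illustration.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle the rows and return (training 70%, testing 30%)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_threshold(rows, feature, label="covid"):
    """Return (threshold, weighted Gini impurity) of the best single split."""
    values = sorted({r[feature] for r in rows})
    best_t, best_score = None, float("inf")
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate cut halfway between neighbouring values
        left = [r[label] for r in rows if r[feature] <= t]
        right = [r[label] for r in rows if r[feature] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Synthetic data (assumption): one feature that separates the two groups
# and one that is pure noise. Label 1 stands for COVID-19, 0 for CAP.
rng = random.Random(42)
rows = []
for i in range(200):
    y = i % 2
    rows.append({
        "covid": y,
        "informative": rng.gauss(2.0 * y, 1.0),
        "noise": rng.gauss(0.0, 1.0),
    })

train, test = split_70_30(rows, seed=1)

# Rank features by the impurity of their best split (lower is better).
splits = {f: best_threshold(train, f) for f in ("informative", "noise")}
ranking = sorted(splits, key=lambda f: splits[f][1])
top = ranking[0]
threshold = splits[top][0]

# Label each side of the split with its majority class in the training set.
left = [r["covid"] for r in train if r[top] <= threshold]
right = [r["covid"] for r in train if r[top] > threshold]
left_label = int(2 * sum(left) >= len(left))
right_label = int(2 * sum(right) >= len(right))

# Evaluate the rule on the held-out 30%.
hits = sum(
    (left_label if r[top] <= threshold else right_label) == r["covid"]
    for r in test
)
accuracy = hits / len(test)
print(f"top feature: {top}, threshold: {threshold:.2f}, test accuracy: {accuracy:.2f}")
```

A real analysis would replace the stump ranking with an interpretable boosted model and fit a full decision tree on the selected variables; the sketch only shows how the split, selection, and interpretation steps fit together.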

Keywords: COVID-19; Community-acquired pneumonia; Machine learning; Predictor.

MeSH terms

  • COVID-19* / diagnosis
  • Case-Control Studies
  • Humans
  • Machine Learning
  • Pneumonia* / diagnosis
  • Risk Factors