Detection of COVID-19 Patients Using Machine Learning Techniques: A Nationwide Chilean Study

Int J Environ Res Public Health. 2022 Jun 30;19(13):8058. doi: 10.3390/ijerph19138058.


Epivigila is Chile's integrated epidemiological surveillance system. It holds more than 17,000,000 Chilean patient records, making it an essential and unique source of information for the quantitative and qualitative analysis of the COVID-19 pandemic in Chile. Given this volume of data, however, it is difficult for health professionals to classify the records and determine which symptoms and comorbidities are associated with infected patients. This paper compares machine learning techniques (such as support-vector machine, decision tree, and random forest) for determining whether a patient has COVID-19 based on the symptoms and comorbidities reported in Epivigila. From the group of patients with COVID-19, we selected a sample of 10% of confirmed patients on which to execute and evaluate the techniques. We compared the techniques using precision, recall, accuracy, F1-score, and AUC. The results suggest that the support-vector machine performs better than the decision tree and random forest in terms of recall, accuracy, F1-score, and AUC. Machine learning techniques can help process and classify large volumes of data more efficiently and effectively, speeding up healthcare decision making.
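
The five evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight-patient toy labels and scores are all assumptions made for illustration and are not Epivigila data or results from the study.

```python
# Illustrative sketch only: toy labels and scores, NOT data from the study.
# Label 1 = COVID-19 positive, label 0 = negative.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Precision, recall, accuracy, and F1-score from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier output for eight patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Precision, recall, accuracy, and F1-score depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone. Reporting both kinds of metric is one common way to compare classifiers such as the three named in the abstract.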

Keywords: Epivigila; comorbidities; machine learning; symptoms.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / epidemiology
  • Chile / epidemiology
  • Humans
  • Machine Learning
  • Pandemics
  • Support Vector Machine

Grants and funding

This work was funded by ANID—Millennium Science Initiative Program—Millennium Nucleus on Sociomedicine—NCS2021_013.