A multi-layer model for the early detection of COVID-19

J R Soc Interface. 2021 Aug;18(181):20210284. doi: 10.1098/rsif.2021.0284. Epub 2021 Aug 4.

Abstract

Current COVID-19 screening efforts rely mainly on reported symptoms and potential exposure to infected individuals. Here, we developed a machine-learning model for COVID-19 detection that uses four layers of information: (i) sociodemographic characteristics of the individual, (ii) spatio-temporal patterns of the disease, (iii) medical condition and general health consumption of the individual and (iv) information reported by the individual during the testing episode. We evaluated our model on 140 682 members of Maccabi Health Services who were tested for COVID-19 at least once between February and October 2020. These individuals underwent, in total, 264 516 COVID-19 PCR tests, of which 16 512 were positive. Our multi-layer model obtained an area under the curve (AUC) of 81.6% when evaluated over all the individuals in the dataset, and an AUC of 72.8% when only individuals who did not report any symptoms were included. Furthermore, when restricted to information collected before the testing episode (i.e. before the individual had the chance to report any symptoms), our model still reached an AUC of 79.5%. The ability to predict the outcomes of COVID-19 tests early is pivotal for breaking transmission chains, and could support a more efficient testing policy.
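To make the evaluation metric concrete, the following is a minimal Python sketch of how an AUC can be computed over all tested individuals and over the subset who reported no symptoms. It is not the authors' code: the function name `roc_auc`, the record layout and the toy scores are assumptions made purely for illustration, and only the two test counts (264 516 tests, 16 512 positive) are taken from the abstract.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic toy records (hypothetical, NOT study data):
# (PCR result, model score, reported any symptom)
records = [
    (1, 0.90, True),
    (1, 0.85, True),
    (0, 0.50, True),
    (0, 0.60, True),
    (1, 0.70, False),
    (1, 0.25, False),
    (0, 0.30, False),
    (0, 0.20, False),
]

# AUC over every record in the toy set.
auc_all = roc_auc([r[0] for r in records], [r[1] for r in records])

# AUC restricted to records with no reported symptoms.
asymptomatic = [r for r in records if not r[2]]
auc_asymptomatic = roc_auc(
    [r[0] for r in asymptomatic], [r[1] for r in asymptomatic]
)

# Share of positive tests, using the counts reported in the abstract.
total_tests = 264_516
positive_tests = 16_512
positivity = positive_tests / total_tests  # roughly 6.2% of tests

print(f"AUC (all): {auc_all:.4f}")
print(f"AUC (no reported symptoms): {auc_asymptomatic:.4f}")
print(f"Test positivity: {positivity:.2%}")
```

The pairwise definition used here is quadratic in the number of records, which is fine for a toy example; for a dataset of the size described above, a rank-based implementation would be the more practical choice.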

Keywords: COVID-19; early detection; electronic medical records; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • COVID-19*
  • Humans
  • Machine Learning
  • SARS-CoV-2

Associated data

  • figshare/10.6084/m9.figshare.c.5527164