Individualizing Risk Prediction for Positive Coronavirus Disease 2019 Testing: Results From 11,672 Patients

Chest. 2020 Oct;158(4):1364-1375. doi: 10.1016/j.chest.2020.05.580. Epub 2020 Jun 10.


Background: Coronavirus disease 2019 (COVID-19) is sweeping the globe. Despite multiple case series, actionable knowledge to tailor decision-making proactively is missing.

Research question: Can a statistical model accurately predict infection with COVID-19?

Study design and methods: We developed a prospective registry of all patients tested for COVID-19 at Cleveland Clinic to create individualized risk prediction models. We focus here on the likelihood of a positive nasal or oropharyngeal COVID-19 test. A least absolute shrinkage and selection operator (LASSO) logistic regression algorithm was constructed that removed variables that were not contributing to the model's cross-validated concordance index. After external validation in a temporally and geographically distinct cohort, the statistical prediction model was illustrated as a nomogram and deployed in an online risk calculator.
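The modeling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a solver, a penalty grid, or a fold scheme, so the proximal gradient optimizer, the three penalty values, the 3-fold split, the synthetic data, and every function name below are assumptions chosen only to show how an L1 penalty can zero out coefficients and how a penalty might be picked by cross-validated concordance index (c-statistic).

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=200):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is not penalized. Returns (intercept, weights).
    """
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            r = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += r / n
            for j in range(p):
                grad_w[j] += r * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(w[j] - step * grad_w[j], step * lam) for j in range(p)]
    return b, w


def predict(model, X):
    b, w = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def c_statistic(y, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(
        1.0 if sp > sn else 0.5 if sp == sn else 0.0 for sp in pos for sn in neg
    )
    return wins / (len(pos) * len(neg))


def cv_c_statistic(X, y, lam, k=3):
    """Mean held-out c-statistic over k interleaved folds."""
    total = 0.0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        model = fit_lasso_logistic([X[i] for i in train], [y[i] for i in train], lam)
        total += c_statistic([y[i] for i in test], predict(model, [X[i] for i in test]))
    return total / k


# Synthetic data (an assumption, not registry data): feature 0 drives the
# outcome, features 1 and 2 are pure noise. All features lie in [-1, 1].
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(240)]
y = [1 if rng.random() < sigmoid(4.0 * xi[0]) else 0 for xi in X]

# Pick the penalty with the best cross-validated c-statistic, then refit.
lambdas = [0.1, 0.03, 0.01]  # hypothetical grid
best_lam = max(lambdas, key=lambda lam: cv_c_statistic(X, y, lam))
final_model = fit_lasso_logistic(X, y, best_lam)
print("chosen penalty:", best_lam)
print("coefficients:", final_model[1])
```

Coefficients that the soft-threshold step leaves at exactly zero correspond to variables dropped from the model. In practice an established library routine for penalized regression would normally replace the hand-written loop.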

Results: In the development cohort, 11,672 patients fulfilled study criteria, including 818 patients (7.0%) who tested positive for COVID-19; in the validation cohort, 2295 patients fulfilled criteria, including 290 patients who tested positive for COVID-19. Male, African American, older patients, and those with known COVID-19 exposure were at higher risk of being positive for COVID-19. Risk was reduced in those who had pneumococcal polysaccharide or influenza vaccine or who were on melatonin, paroxetine, or carvedilol. Our model had favorable discrimination (c-statistic = 0.863 in the development cohort and 0.840 in the validation cohort) and calibration. We present sensitivity, specificity, negative predictive value, and positive predictive value at different prediction cutoff points. The calculator is freely available at
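As an illustration of how sensitivity, specificity, and predictive values change with the prediction cutoff, the sketch below computes them from predicted risks and observed test results. The ten toy patients, the two cutoffs, and the function name `cutoff_metrics` are hypothetical; only the cohort counts used for the prevalence arithmetic at the end come from the abstract.

```python
def cutoff_metrics(y_true, y_prob, cutoff):
    """Call a patient 'predicted positive' when risk >= cutoff, then summarize."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= cutoff)
    fp = sum(1 for y, p in pairs if y == 0 and p >= cutoff)
    fn = sum(1 for y, p in pairs if y == 1 and p < cutoff)
    tn = sum(1 for y, p in pairs if y == 0 and p < cutoff)

    def ratio(num, den):
        # None marks a metric that is undefined at this cutoff.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data (hypothetical): 1 = tested positive, 0 = tested negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.2, 0.7, 0.3, 0.1, 0.1, 0.05, 0.05, 0.02]

for cutoff in (0.15, 0.5):
    print(cutoff, cutoff_metrics(y_true, y_prob, cutoff))

# Prevalence of a positive test, from the counts reported in the abstract.
prevalence_development = 818 / 11672
prevalence_validation = 290 / 2295
print(round(100 * prevalence_development, 1), round(100 * prevalence_validation, 1))
```

Lowering the cutoff trades specificity for sensitivity. Because positive and negative predictive values depend on prevalence, the same cutoff can yield different predictive values in cohorts with different positivity rates.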

Interpretation: Prediction of a COVID-19 positive test is possible and could help direct health-care resources. We demonstrate relevance of age, race, sex, and socioeconomic characteristics in COVID-19 susceptibility and suggest a potential modifying role of certain common vaccinations and drugs that have been identified in drug-repurposing studies.

Keywords: COVID-19; infectious disease; predictive modeling; testing.

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Betacoronavirus*
  • COVID-19
  • Coronavirus Infections / complications
  • Coronavirus Infections / diagnosis*
  • Coronavirus Infections / epidemiology
  • Female
  • Humans
  • Logistic Models
  • Male
  • Middle Aged
  • Models, Statistical
  • Pandemics
  • Pneumonia, Viral / complications
  • Pneumonia, Viral / diagnosis*
  • Pneumonia, Viral / epidemiology
  • Predictive Value of Tests
  • Retrospective Studies
  • Risk Factors
  • SARS-CoV-2