Pandemic velocity: Forecasting COVID-19 in the US with a machine learning & Bayesian time series compartmental model

PLoS Comput Biol. 2021 Mar 29;17(3):e1008837. doi: 10.1371/journal.pcbi.1008837. eCollection 2021 Mar.


Predictions of COVID-19 case growth and mortality are critical to the decisions of political leaders, businesses, and individuals grappling with the pandemic. This predictive task is challenging due to the novelty of the virus, limited data, and dynamic political and societal responses. We embed a Bayesian time series model and a random forest algorithm within an epidemiological compartmental model for empirically grounded COVID-19 predictions. The Bayesian case model fits a location-specific curve to the velocity (first derivative) of the log-transformed cumulative case count, borrowing strength across geographic locations and incorporating prior information to obtain a posterior distribution for case trajectories. The compartmental model uses this distribution and predicts deaths using a random forest algorithm trained on COVID-19 data and population-level characteristics, yielding daily projections and interval estimates for cases and deaths in U.S. states. We evaluated the model by training it on progressively longer periods of the pandemic and computing its predictive accuracy over 21-day forecasts. The substantial variation in predicted trajectories and associated uncertainty between states is illustrated by comparing three distinct locations: New York, Colorado, and West Virginia. The sophistication and accuracy of this COVID-19 model offer reliable predictions and uncertainty estimates for the current trajectory of the pandemic in the U.S. and provide a platform for future predictions as shifting political and societal responses alter its course.
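
As a rough illustration of two ideas in the abstract, the sketch below computes a discrete "velocity" (the day-to-day difference of the log cumulative case count) and generates training windows of increasing length, each followed by a 21-day forecast window. It is not the authors' implementation: the function names, the use of `log1p` to avoid taking the log of zero, the 7-day step between windows, and the toy numbers are all assumptions made for the example.

```python
import numpy as np

def log_case_velocity(cumulative_cases):
    """Daily first difference of the log cumulative case count.

    A discrete stand-in for the first derivative. log1p (log(1 + x)) is
    an assumption of this sketch so that zero counts do not produce -inf.
    """
    cases = np.asarray(cumulative_cases, dtype=float)
    return np.diff(np.log1p(cases))

def rolling_origin_splits(n_days, first_train_days, horizon=21, step=7):
    """Yield (train_days, test_days) index ranges.

    Each training window starts at day 0 and grows by `step` days; the
    test window is the `horizon` days immediately after it. The step
    size is hypothetical; only the 21-day horizon comes from the text.
    """
    end = first_train_days
    while end + horizon <= n_days:
        yield range(0, end), range(end, end + horizon)
        end += step

# Toy cumulative counts in which (1 + cases) doubles each day.
velocity = log_case_velocity([0, 1, 3, 7])

# 60 days of data, first training window of 30 days.
splits = list(rolling_origin_splits(60, 30))
```

In the toy series the velocity is constant at log 2, which is what steady exponential growth looks like on this scale; a falling velocity would indicate slowing growth. Fitting the curve to the velocity and pooling information across locations would still require the Bayesian model itself, which this sketch does not attempt.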

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • COVID-19 / epidemiology*
  • COVID-19 / mortality*
  • COVID-19 / transmission
  • Computational Biology
  • Forecasting / methods*
  • Humans
  • Machine Learning
  • Models, Statistical*
  • Pandemics / statistics & numerical data*
  • SARS-CoV-2*
  • United States / epidemiology