A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data

Chaos Solitons Fractals. 2022 Mar:156:111779. doi: 10.1016/j.chaos.2021.111779. Epub 2022 Jan 5.

Abstract

During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models depends heavily on the accuracy of their assumptions about disease dynamics within a population; such models are therefore susceptible to human error, unexpected events, and unknown characteristics of a novel infectious agent like COVID-19. We present a relatively non-parametric random forest model that forecasts the number of COVID-19 cases at the U.S. county level. Its most prioritized training features are derived from easily accessible, standard epidemiological data (i.e., regional test positivity rate) and the effective reproduction number (R_t) from compartmental models. A novel input training feature is case projections generated by aligning the estimated effective reproduction number (pre-computed by COVIDActNow.org) with real-time testing data until the two are maximally correlated, which helps our model fit the epidemic's trajectory as ascertained by traditional models. The poor reliability of R_t is partially mitigated by adding dynamic population mobility data and the prevalence and mortality of non-COVID-19 diseases, which gauge population disease susceptibility. The model was used to generate forecasts for 1, 2, 3, and 4 weeks into the future for each reference week between 11/01/2020 and 01/10/2021 for 3068 counties. Over this time period, it maintained a mean absolute error (MAE) of less than 300 weekly cases/100,000 and consistently outperformed or performed comparably with gold-standard compartmental models. Furthermore, it holds great promise for ensemble modeling because it can accommodate a more expansive training feature set while maintaining good performance and limited resource utilization.
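
The alignment step described above (shifting the reproduction number series against testing data until the two are maximally correlated) can be illustrated with a minimal sketch. The abstract does not specify the procedure, so this assumes a Pearson correlation evaluated over integer day lags; the function name `best_lag`, the `max_lag` parameter, and the synthetic series are hypothetical and are not taken from the paper or from COVIDActNow.org.

```python
import numpy as np


def best_lag(rt, cases, max_lag):
    """Find the lag (in days) at which rt best correlates with later cases.

    Assumption: alignment is scored with the Pearson correlation between
    rt[t] and cases[t + lag]; the paper's actual criterion may differ.
    Returns (lag, correlation) for the best lag in 0..max_lag.
    """
    rt = np.asarray(rt, dtype=float)
    cases = np.asarray(cases, dtype=float)
    best = (0, -np.inf)
    for lag in range(max_lag + 1):
        x = rt[: len(rt) - lag]   # earlier reproduction number values
        y = cases[lag:]           # case values `lag` days later
        n = min(len(x), len(y))
        if n < 3:                 # too few overlapping points to correlate
            break
        r = np.corrcoef(x[:n], y[:n])[0, 1]
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic illustration only: cases follow the rt signal 5 days later.
base = np.sin(np.arange(125) / 7.0)
rt_demo = 1.0 + 0.3 * base[5:]        # 120 days of a made-up R_t series
cases_demo = 100.0 + 50.0 * base[:120]
lag_demo, corr_demo = best_lag(rt_demo, cases_demo, max_lag=14)
```

On the synthetic series, the case curve is constructed to trail the reproduction number by five days, so the search recovers that lag. In a forecasting setting, the shifted series could then be supplied as one input feature among others to a regressor such as a random forest.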

Keywords: COVID-19; Compartmental model; Mobility; Random forest; US county.