The impact of weather condition and social activity on COVID-19 transmission in the United States

J Environ Manage. 2022 Jan 15;302(Pt B):114085. doi: 10.1016/j.jenvman.2021.114085. Epub 2021 Nov 11.

Abstract

The coronavirus disease 2019 (COVID-19) was first reported in December 2019 and spread rapidly worldwide. As with other severe acute respiratory syndromes, whether seasonality affects the spread of COVID-19 infection is widely debated. This study presents two different approaches to analyse the impact of social activity factors and weather variables on daily COVID-19 cases at the county level across the continental U.S. (CONUS). The first is a traditional statistical method, i.e., the Pearson correlation coefficient, whereas the second is a machine learning algorithm, i.e., a random forest regression model. The Pearson correlation is used as a rough test of the relationship between COVID-19 cases and either the weather variables or the social activity factor (i.e., the social distance index). The random forest regression model is used to investigate the feasibility of estimating the number of county-level daily confirmed COVID-19 cases from different combinations of eight factors (county population, county population density, county social distance index, air temperature, specific humidity, shortwave radiation, precipitation, and wind speed). According to the Pearson correlation analysis, the number of daily confirmed COVID-19 cases is weakly correlated with the social distance index, air temperature, and specific humidity. The random forest model shows that the estimation of COVID-19 cases is more accurate when weather variables are added as input data. Specifically, the most important factors for estimating daily COVID-19 cases are population and population density, followed by the social distance index and the five weather variables, with temperature and specific humidity being more critical than shortwave radiation, wind speed, and precipitation. Validation shows that the correlation coefficients between the daily COVID-19 cases estimated by the random forest model and the observed ones are generally around 0.85.
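As an illustration of the first approach, the Pearson correlation coefficient between a weather variable and daily case counts could be computed along the following lines. This is a minimal sketch and not the authors' code: the function name `pearson_r` and the sample values for `temperature` and `daily_cases` are hypothetical and chosen only to show the calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (variance numerators).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for one county (NOT data from the study).
temperature = [5.0, 10.0, 15.0, 20.0, 25.0]  # air temperature, degrees Celsius
daily_cases = [120, 100, 90, 70, 60]         # daily confirmed cases

r = pearson_r(temperature, daily_cases)
print(f"Pearson r = {r:.3f}")
```

A coefficient near zero would indicate a weak linear relationship, such as the one the abstract reports for real county-level data. The strongly negative value produced by these made-up numbers reflects only how the toy data were chosen.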

Keywords: COVID-19 transmission; Machine learning; Random forest regression model; Social activity factor; Weather condition.

MeSH terms

  • COVID-19*
  • Humans
  • Humidity
  • SARS-CoV-2
  • Temperature
  • United States
  • Weather