Multivariate Analysis of Risk Factors of the COVID-19 Pandemic in the Community of Madrid, Spain

Int J Environ Res Public Health. 2021 Sep 1;18(17):9227. doi: 10.3390/ijerph18179227.


More than a year has passed since Chinese authorities identified a deadly new strain of coronavirus, SARS-CoV-2. Since then, research on the transmission risk factors of COVID-19 has been intense, and the relationship between COVID-19 and environmental conditions has become an increasingly popular research topic. Building on the findings of this early research, we focused on the Community of Madrid, Spain, one of the world's most significant pandemic hotspots. We applied several multivariate statistical analyses: principal component analysis, analysis of variance (ANOVA), clustering, and linear regression models. Principal component analysis reduced the risk factors to three new components that explained 71% of the original variance. Cluster analysis was then used to delimit the territory of Madrid according to these new risk components. An ANOVA test revealed different incidence rates between the territories delimited by these components. Finally, a set of linear models was applied to show that environmental factors have a greater influence on COVID-19 infections than socioeconomic dimensions. This type of local research provides valuable information that could help societies become more resilient in the face of future pandemics.
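
The four analysis steps named in the abstract (principal component analysis, clustering, ANOVA, and linear models) can be sketched end to end as follows. This is a minimal illustration only, not the authors' code or data: the territories, risk factors, and incidence values are synthetic, the choice of k-means with three clusters is an assumption (the abstract says only "clustering"), and the helper names `kmeans` and `anova_f` are hypothetical. The sketch does not attempt to reproduce the reported 71% explained variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 60 territories x 8 risk factors,
# generated from 3 latent dimensions plus noise.
n_areas, n_factors = 60, 8
latent = rng.normal(size=(n_areas, 3))
loadings = rng.normal(size=(3, n_factors))
X = latent @ loadings + 0.5 * rng.normal(size=(n_areas, n_factors))
incidence = 100 + 20 * latent[:, 0] + 5 * rng.normal(size=n_areas)

# 1. PCA on standardized risk factors, computed via SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)   # share of variance per component
scores = Z @ Vt[:3].T             # keep the first three components


# 2. Cluster territories on the component scores (plain k-means, assumed).
def kmeans(points, k, n_iter=50):
    centers = points[:k].copy()   # deterministic initialisation
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


labels = kmeans(scores, k=3)


# 3. One-way ANOVA F statistic: does incidence differ between clusters?
def anova_f(values, groups):
    grand = values.mean()
    ids = np.unique(groups)
    ss_between = sum(
        np.sum(groups == g) * (values[groups == g].mean() - grand) ** 2 for g in ids
    )
    ss_within = sum(
        np.sum((values[groups == g] - values[groups == g].mean()) ** 2) for g in ids
    )
    df_between, df_within = len(ids) - 1, len(values) - len(ids)
    return (ss_between / df_between) / (ss_within / df_within)


f_stat = anova_f(incidence, labels)

# 4. Linear model: incidence regressed on the three component scores.
design = np.column_stack([np.ones(n_areas), scores])
coef, *_ = np.linalg.lstsq(design, incidence, rcond=None)

print("variance explained by 3 components:", explained[:3].sum())
print("ANOVA F statistic:", f_stat)
print("intercept and component coefficients:", coef)
```

In a real analysis, the relative size of the fitted coefficients for environmentally loaded versus socioeconomically loaded components would be the quantity of interest; with synthetic data the numbers printed here carry no epidemiological meaning.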

Keywords: COVID-19; cluster analysis; environmental and socioeconomic risk factors; general linear model; principal component analysis; the community of Madrid.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19*
  • Humans
  • Multivariate Analysis
  • Pandemics*
  • Risk Factors
  • SARS-CoV-2
  • Spain / epidemiology