A multivariate analysis of CalEnviroScreen: comparing environmental and socioeconomic stressors versus chronic disease

Environ Health. 2017 Dec 13;16(1):131. doi: 10.1186/s12940-017-0344-z.


Background: The health-risk assessment paradigm is shifting from single-stressor evaluation towards cumulative assessment of multiple stressors. Recent efforts to develop broad-scale public health hazard datasets provide an opportunity to develop and evaluate multiple exposure hazards in combination.

Methods: We performed a multivariate study of the spatial relationship between 12 indicators of environmental hazard, 5 indicators of socioeconomic hardship, and 3 health outcomes. Indicators were obtained from CalEnviroScreen (version 3.0), a publicly available environmental justice screening tool developed by the State of California Environmental Protection Agency. The indicators were compared to the total rate of hospitalization for 14 ICD-9 disease categories (a measure of disease burden) at the zip code tabulation area population level. We performed principal component analysis to visualize and reduce the CalEnviroScreen data and spatial autoregression to evaluate associations with disease burden.
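The PCA step described above can be illustrated with a minimal sketch. It does not reproduce the study's analysis: the data are synthetic, the `composite` score is a hypothetical plain average of indicators (not the CalEnviroScreen calculation method), and the function names `pca` and `spearman` are illustrative. It assumes indicators are standardized before decomposition and that no tied values occur in the rank correlation.

```python
import numpy as np

def pca(X):
    """Return (scores, explained_variance_ratio) for a rows-by-indicators matrix."""
    # Standardize each indicator so that differing units do not dominate the axes.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized matrix; singular values come back in descending order.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * s                      # coordinates of each area on each PC
    ratio = s**2 / np.sum(s**2)         # share of total variance per PC
    return scores, ratio

def spearman(a, b):
    """Spearman rank correlation, assuming continuous data without ties."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

# Synthetic stand-in for 20 indicators across 500 areas (assumed sizes).
rng = np.random.default_rng(0)
n_areas, n_indicators = 500, 20
shared = rng.normal(size=(n_areas, 1))              # one common underlying factor
X = 0.7 * shared + rng.normal(size=(n_areas, n_indicators))

scores, ratio = pca(X)
composite = X.mean(axis=1)                          # hypothetical composite score
# The sign of a principal component is arbitrary, so compare absolute values.
rho = abs(spearman(scores[:, 0], composite))

print("variance explained by PC1 and PC2:", ratio[:2].sum())
print("abs. Spearman rho between PC1 and composite:", rho)
```

Because the synthetic indicators share a single common factor, the first PC tracks the composite closely; real indicator data would need inspection of the loadings before any such interpretation.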

Results: CalEnviroScreen was strongly associated with the first principal component (PC) from a principal component analysis (PCA) of all 20 variables (Spearman ρ = 0.95). In a PCA of the 12 environmental variables, two PC axes explained 43% of variance, with the first axis indicating industrial activity and air pollution, and the second associated with ground-level ozone, drinking water contamination and PM2.5. Mass of pesticides used in agriculture was poorly or negatively correlated with all other environmental indicators, and with the CalEnviroScreen calculation method, suggesting a limited ability of the method to capture agricultural exposures. In a PCA of the 5 socioeconomic variables, the first PC explained 66% of variance, representing overall socioeconomic hardship. In simultaneous autoregressive models, the first environmental and socioeconomic PCs were both significantly associated with the disease burden measure, but more model variation was explained by the socioeconomic PCs.

Conclusions: This study supports the use of CalEnviroScreen for its intended purpose of screening California regions for areas with high environmental exposure and population vulnerability. The results further suggest the hypothesis that socioeconomic status has a greater impact on overall burden of disease than environmental pollutant exposure does.

Keywords: CalEnviroScreen; Census; Environmental exposure; Environmental justice; Health hazard; Multivariate analysis; Principal component analysis; Socioeconomic status; Spatial regression; Vulnerable populations.

MeSH terms

  • California
  • Chronic Disease
  • Cost of Illness*
  • Environmental Exposure*
  • Environmental Pollutants
  • Hospitalization
  • Humans
  • Models, Theoretical*
  • Multivariate Analysis
  • Principal Component Analysis
  • Socioeconomic Factors*
  • Vulnerable Populations*