Systematic evaluation of land use regression models for NO₂

Meng Wang; Rob Beelen; Marloes Eeftens; Kees Meliefste; Gerard Hoek; Bert Brunekreef

doi:10.1021/es204183v

Environ Sci Technol. 2012 Apr 17;46(8):4481-9. doi: 10.1021/es204183v. Epub 2012 Apr 2.

Authors

Meng Wang¹, Rob Beelen, Marloes Eeftens, Kees Meliefste, Gerard Hoek, Bert Brunekreef

Affiliation

¹ Institute for Risk Assessment Sciences (IRAS), Division of Environmental Epidemiology, Utrecht University, Utrecht, The Netherlands.

PMID: 22435498
DOI: 10.1021/es204183v

Abstract

Land use regression (LUR) models have become a popular tool for explaining the spatial variation of air pollution concentrations. Independent evaluation is important. We developed LUR models for nitrogen dioxide (NO₂) using measurements conducted at 144 sampling sites in The Netherlands. Sites were randomly divided into training data sets with a size of 24, 36, 48, 72, 96, 108, and 120 sites. LUR models were evaluated using (1) internal "leave-one-out cross-validation" (LOOCV) within the training data sets and (2) external "hold-out" validation (HV) against independent test data sets. In addition, we calculated mean square error based validation R² values. The mean adjusted model R² and LOOCV R² decreased slightly, from 0.87 to 0.82 and from 0.83 to 0.79, respectively, with an increasing number of training sites. In contrast, the mean HV R² was lowest (0.60) with the smallest training sets and increased to 0.74 with the largest training sets. Predicted concentrations were more accurate at sites with out-of-range values for predictor variables after these values were changed to the minimum or maximum of the range observed in the corresponding training data set. LUR models for NO₂ perform less well against independent measurements when they are based on relatively small training sets. In our specific application, however, models based on as few as 24 training sites achieved acceptable hold-out validation R² values of, on average, 0.60.
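
The evaluation steps named in the abstract (LOOCV within a training set, hold-out validation against independent sites, a mean square error based R², and truncation of out-of-range predictor values) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single hypothetical predictor instead of a full multi-variable LUR model, synthetic data, and an MSE-based R² defined as 1 - MSE / variance of the observations; the abstract does not give that formula, and the same definition is reused here for the LOOCV score. All function names are illustrative.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) model to a list of predictor values."""
    intercept, slope = model
    return [intercept + slope * xi for xi in x]


def mse_r2(obs, pred):
    """Assumed MSE-based R2: 1 - MSE / variance of observations (may be negative)."""
    mean_obs = mean(obs)
    mse = mean([(o - p) ** 2 for o, p in zip(obs, pred)])
    var = mean([(o - mean_obs) ** 2 for o in obs])
    return 1.0 - mse / var


def loocv_r2(x, y):
    """Leave each training site out in turn, refit, and predict the left-out site."""
    preds = []
    for i in range(len(x)):
        model = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, [x[i]])[0])
    return mse_r2(y, preds)


def clamp(x, lo, hi):
    """Truncate predictor values to the range [lo, hi] seen in the training set."""
    return [min(max(xi, lo), hi) for xi in x]


def demo(n_train=24, n_sites=144, seed=0):
    """Split synthetic sites into training and test sets; return (LOOCV R2, HV R2)."""
    rng = random.Random(seed)
    # Synthetic predictor and concentrations; the numbers are purely illustrative.
    x = [rng.uniform(0, 100) for _ in range(n_sites)]
    y = [15 + 0.2 * xi + rng.gauss(0, 4) for xi in x]
    idx = list(range(n_sites))
    rng.shuffle(idx)
    train, test = idx[:n_train], idx[n_train:]
    x_train = [x[i] for i in train]
    y_train = [y[i] for i in train]
    x_test = [x[i] for i in test]
    y_test = [y[i] for i in test]
    model = fit_line(x_train, y_train)
    # Out-of-range test predictors are set to the training minimum or maximum.
    x_test_clamped = clamp(x_test, min(x_train), max(x_train))
    hv = mse_r2(y_test, predict(model, x_test_clamped))
    return loocv_r2(x_train, y_train), hv


if __name__ == "__main__":
    for n in (24, 36, 48, 72, 96, 108, 120):
        loocv, hv = demo(n_train=n)
        print(n, round(loocv, 2), round(hv, 2))
```

Because the data are synthetic, the printed values say nothing about the paper's results; the sketch only shows how the internal and external scores are computed from the same training/test split.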

MeSH terms

Air Pollutants / analysis*
Environmental Monitoring
Geographic Information Systems
Models, Theoretical*
Motor Vehicles
Netherlands
Nitrogen Dioxide / analysis*
Regression Analysis

Substances

Air Pollutants
Nitrogen Dioxide