Anonymisation of address coordinates for microlevel analyses of the built environment: a simulation study

BMJ Open. 2015 Mar 9;5(3):e006481. doi: 10.1136/bmjopen-2014-006481.


Background: Data privacy is a major concern in spatial epidemiology because exact residential locations or parts of participants' addresses, such as street or zip codes, are used to perform geospatial analyses. To address this concern, data are commonly aggregated to units such as census districts or zip code areas, though any spatial aggregation leads to a loss of spatial variability. For the assessment of urban opportunities for physical activity conducted in the IDEFICS (Identification and prevention of dietary- and lifestyle-induced health effects in children and infants) study, macrolevel analyses were performed, but the responsible office for data protection did not permit the use of exact residential addresses for microlevel analyses. We therefore implemented a spatial blurring method that anonymises address coordinates depending on the underlying population density.

Methods: We added a zero-mean Gaussian distributed error to individual address coordinates, with the variance σ² depending on the population density and on the chosen k-anonymity. We generated 1000 random point locations and blurred each of them 100 times to obtain anonymised locations. For each location, a 1 km network-dependent neighbourhood was used to calculate walkability indices. Indices of blurred locations were compared with indices based on their sampling origins to determine the effect of spatial blurring on the assessment of the built environment.
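
The blurring step described in the Methods can be illustrated with a minimal Python sketch. The abstract does not state how σ is derived from population density and k, so the rule used here is an assumption made purely for illustration: σ is set to the radius of a circle expected to contain k residents at the given density. The function names `blur_sigma_m` and `blur_point`, the value k = 5, and the example densities are likewise hypothetical and are not taken from the study. Coordinates are assumed to be projected and expressed in metres.

```python
import math
import random


def blur_sigma_m(density_per_km2, k):
    """Return a blurring standard deviation in metres.

    ASSUMPTION (not from the abstract): sigma equals the radius of a
    circle expected to contain k residents at the given density,
    i.e. k = density * pi * r^2.
    """
    if density_per_km2 <= 0 or k <= 0:
        raise ValueError("density and k must be positive")
    radius_km = math.sqrt(k / (math.pi * density_per_km2))
    return radius_km * 1000.0


def blur_point(x_m, y_m, density_per_km2, k, rng):
    """Add independent zero-mean Gaussian noise to projected x/y (metres)."""
    sigma = blur_sigma_m(density_per_km2, k)
    return x_m + rng.gauss(0.0, sigma), y_m + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the sketch is repeatable
    origin = (0.0, 0.0)      # hypothetical sampling origin
    for density in (150, 1500, 15000):  # illustrative residents per km^2
        # Blur the same origin 100 times, mirroring the repeated blurring.
        blurred = [blur_point(*origin, density, k=5, rng=rng) for _ in range(100)]
        mean_shift = sum(math.hypot(x, y) for x, y in blurred) / len(blurred)
        print(f"{density:>6} residents/km^2: "
              f"sigma = {blur_sigma_m(density, 5):7.1f} m, "
              f"mean shift = {mean_shift:7.1f} m")
```

Under this assumed rule σ shrinks with the square root of population density, so points in dense areas are displaced less than points in sparse areas, which matches the direction of the effect reported in the Results.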

Results: The extent of spatial blurring decreased with increasing population density. Similarly, mean differences in walkability indices decreased with increasing population density. In densely populated areas with at least 1500 residents per km² in particular, differences between blurred locations and their sampling origins were small and did not affect the assessment of the built environment.

Conclusions: This approach allowed the investigation of the built environment at the microlevel using individual network-dependent neighbourhoods, while meeting data protection requirements. Spatial blurring had only a minor influence on the assessment of walkability, slightly affecting the assessment of the built environment in sparsely populated areas.

Keywords: IDEFICS study; built environment; data protection; geocoding; spatial blurring; walkability.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Confidentiality*
  • Environment Design
  • Geographic Mapping*
  • Health Services Accessibility / statistics & numerical data*
  • Humans
  • Population Density
  • Research Design
  • Residence Characteristics
  • Spatial Analysis