The effect of administrative boundaries and geocoding error on cancer rates in California

Spat Spatiotemporal Epidemiol. 2012 Apr;3(1):39-54. doi: 10.1016/j.sste.2012.02.005. Epub 2012 Feb 10.


Geocoding is often used to produce maps of disease rates from the diagnosis addresses of incident cases to assist with disease surveillance, prevention, and control. In this process, diagnosis addresses are converted into latitude/longitude pairs, which are then aggregated to produce rates at varying geographic scales such as Census tracts, neighborhoods, cities, counties, and states. The specific techniques used within geocoding systems affect where the output geocode is located and can therefore affect the disease rates derived at different geographic aggregations. This paper investigates how county-level cancer rates are affected by the choice of interpolation method when case data are geocoded to the ZIP code level. Four commonly used areal unit interpolation techniques are applied, and the output of each is used to compute crude county-level five-year incidence rates of all cancers in California. We found that the rates observed for 44 of the 58 counties in California vary depending on which interpolation method is used, with rates in some counties increasing by nearly 400% between interpolation methods.
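
To make the mechanism concrete, the following is a minimal sketch of how the choice of areal interpolation can change a crude county rate when a ZIP code straddles a county boundary. The abstract does not name the four techniques studied, so the two shown here (centroid assignment and proportional allocation by a weight such as area share) are illustrative assumptions only, and every ZIP code, county, case count, and population figure is hypothetical toy data rather than a result from the paper.

```python
# Illustrative sketch only: toy data, not values or methods from the paper.

def crude_rate(cases, population, per=100_000):
    """Crude incidence rate: cases divided by population, scaled to `per` persons."""
    return cases / population * per

def allocate_centroid(zip_cases, centroid_county):
    """Assign all of a ZIP's cases to the county that contains its centroid."""
    totals = {}
    for zip_code, n in zip_cases.items():
        county = centroid_county[zip_code]
        totals[county] = totals.get(county, 0.0) + n
    return totals

def allocate_weighted(zip_cases, weights):
    """Split each ZIP's cases across counties in proportion to given weights
    (for example, the share of the ZIP's area lying in each county)."""
    totals = {}
    for zip_code, n in zip_cases.items():
        for county, w in weights[zip_code].items():
            totals[county] = totals.get(county, 0.0) + n * w
    return totals

# Hypothetical inputs: ZIP "Z1" straddles counties "A" and "B".
zip_cases = {"Z1": 100, "Z2": 50}
centroid_county = {"Z1": "A", "Z2": "B"}
area_share = {"Z1": {"A": 0.5, "B": 0.5}, "Z2": {"B": 1.0}}
county_pop = {"A": 50_000, "B": 20_000}

by_centroid = allocate_centroid(zip_cases, centroid_county)
by_weight = allocate_weighted(zip_cases, area_share)

rates_centroid = {c: crude_rate(by_centroid.get(c, 0.0), p) for c, p in county_pop.items()}
rates_weighted = {c: crude_rate(by_weight.get(c, 0.0), p) for c, p in county_pop.items()}

for county in county_pop:
    print(county, rates_centroid[county], rates_weighted[county])
```

In this toy setup the same 150 cases yield different county rates under the two allocations, and the smaller county shows the larger relative shift, because moving a fixed number of cases across a boundary matters more when the denominator is small.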

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • California / epidemiology
  • Data Interpretation, Statistical
  • Epidemiologic Research Design*
  • Geographic Mapping*
  • Humans
  • Neoplasms / epidemiology*