Geocoding is a powerful tool for environmental exposure assessments that rely on spatial databases. Geocoding processes, locators, and reference datasets have improved over time; however, these improvements have not been well characterized. Enrollment addresses for the Agricultural Health Study, a cohort of pesticide applicators and their spouses in Iowa (IA) and North Carolina (NC), were geocoded in 2012-2016 and then again in 2019. We calculated distances between geocodes in the two periods. For a subset, we computed positional errors using "gold standard" rooftop coordinates (IA; N = 3566) or Global Positioning System (GPS) coordinates (IA and NC; N = 1258) and compared errors between periods. We used linear regression to model the change in positional error between time periods (improvement) by rural status and population density, and we used spatial relative risk functions to identify areas with significant improvement. Median improvement between time periods in IA was 41 m (interquartile range, IQR: -2 to 168 m) and 9 m (IQR: -80 to 133 m) based on rooftop coordinates and GPS, respectively. Median improvement in NC was 42 m (IQR: -1 to 109 m) based on GPS. Positional error was greater in rural and low-density areas than in towns and more densely populated areas. Areas of significant improvement in accuracy were identified and mapped across both states. Our findings underscore the importance of evaluating determinants and spatial distributions of errors in geocodes used in environmental epidemiology studies.
Keywords: environmental epidemiology; exposure assessment; geocoding; positional error; rural location; spatial analysis.
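
The positional-error comparison summarized above can be illustrated with a minimal sketch. It assumes that positional error is the great-circle distance between a geocode and its reference coordinate, and that improvement is the earlier error minus the later error, so that positive values mean the later geocode is closer to the reference. The abstract does not state which distance formula was used; the haversine formula on a spherical Earth is an assumption here, and the function names and coordinates are hypothetical, not study data.

```python
import math
from statistics import median

# Mean Earth radius in metres; a spherical approximation (assumption).
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def improvement_m(old, new, ref):
    """Earlier positional error minus later positional error, in metres.

    Positive values mean the later geocode lies closer to the reference point.
    """
    return haversine_m(*old, *ref) - haversine_m(*new, *ref)


# Made-up (lat, lon) triples for illustration only; not study data.
records = [
    {"old": (42.0010, -93.5000), "new": (42.0001, -93.5000), "ref": (42.0000, -93.5000)},
    {"old": (35.5000, -78.0000), "new": (35.5000, -78.0010), "ref": (35.5000, -78.0000)},
    {"old": (41.5000, -93.0020), "new": (41.5000, -93.0010), "ref": (41.5000, -93.0000)},
]

improvements = [improvement_m(r["old"], r["new"], r["ref"]) for r in records]
median_improvement = median(improvements)
```

In this toy data the second record has a negative improvement because the later geocode is farther from the reference than the earlier one, which is how an interquartile range can extend below zero even when the median improvement is positive.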