Non-Stationary Model for Crime Rate Inference Using Modern Urban Data

IEEE Trans Big Data. 2019 Jun;5(2):180-194. doi: 10.1109/TBDATA.2017.2786405. Epub 2017 Dec 22.

Abstract

Crime is one of the most important social problems in the country, affecting public safety, child development, and adult socioeconomic status. Understanding which factors lead to higher crime rates is critical for policy makers in their efforts to reduce crime and improve citizens' quality of life. In this paper we tackle a fundamental problem: crime rate inference at the neighborhood level. Traditional approaches have used demographics and geographical influences to estimate crime rates in a region. With the rapid development of positioning technology and the prevalence of mobile devices, a large amount of modern urban data has been collected, and such big data can provide new perspectives for understanding crime. We use large-scale Point-Of-Interest data and taxi flow data from the city of Chicago, IL, USA. We observe significantly improved performance in crime rate inference compared to using traditional features alone, and this improvement is consistent over multiple years. We also show that these new features are significant in the feature importance analysis. The correlations between crime and the various observed features are not constant over the whole city. To address this geospatial non-stationarity, we further employ geographically weighted regression on top of the negative binomial model (GWNBR). Experiments show that GWNBR outperforms the negative binomial model.

Keywords: Crime inference; geographically weighted regression; negative binomial model; taxi flow.
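
The idea behind geographically weighted negative binomial regression can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian distance kernel, a fixed dispersion parameter `alpha` (a full GWNBR would typically estimate it), a log link, and an intercept in the first column of `X`. The function names `gaussian_kernel_weights`, `fit_weighted_nb`, and `gwnbr_coefficients` are hypothetical. For each region, nearby regions receive larger weights, and a separate weighted negative binomial model is fitted, so the coefficients are allowed to vary across the city.

```python
import numpy as np


def gaussian_kernel_weights(coords, center, bandwidth):
    """Gaussian kernel weight of every region relative to one focal region.

    Assumption: Euclidean distance between region centroids and a fixed bandwidth.
    """
    d = np.linalg.norm(coords - center, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)


def fit_weighted_nb(X, y, geo_w, alpha=0.5, n_iter=50, tol=1e-8):
    """Fit a negative binomial (NB2, log link) regression by Fisher scoring.

    Each observation is weighted by geo_w. The dispersion alpha is held fixed
    (an assumption made to keep the sketch short). Column 0 of X is assumed
    to be the intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(np.average(y, weights=geo_w) + 1e-8)  # start at the weighted mean
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        # NB2 variance is mu + alpha * mu^2, so the working weight is mu / (1 + alpha * mu).
        w = geo_w * mu / (1.0 + alpha * mu)
        z = eta + (y - mu) / mu  # working response for the log link
        XtW = X.T * w
        new_beta = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


def gwnbr_coefficients(X, y, coords, bandwidth, alpha=0.5):
    """Return one coefficient vector per region (rows follow the order of coords)."""
    return np.array([
        fit_weighted_nb(X, y, gaussian_kernel_weights(coords, c, bandwidth), alpha)
        for c in coords
    ])
```

With a very large bandwidth all kernel weights approach 1 and every local fit approaches the single global negative binomial model. A small bandwidth lets coefficients change quickly over space at the cost of noisier estimates.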