Predicting Food Safety Compliance for Informed Food Outlet Inspections: A Machine Learning Approach

Int J Environ Res Public Health. 2021 Nov 30;18(23):12635. doi: 10.3390/ijerph182312635.


Consumer food environments have transformed dramatically in the last decade. Food outlet prevalence has increased, and people are eating food outside the home more than ever before. Despite these developments, national spending on food control has fallen. The National Audit Office reports that only 14% of local authorities are up to date with food business inspections, exposing consumers to unknown levels of risk. Given the scarcity of local authority resources, this paper presents a data-driven approach to predicting compliance for newly opened businesses and those awaiting repeat inspections. This work builds on the theory that a food outlet's compliance is a function of its geographic context, namely the characteristics of the neighborhood in which it sits. We explore the utility of three machine learning approaches to predict non-compliant food outlets in England and Wales using openly accessible socio-demographic, business type, and urbanness features at the output area level. We find that the synthetic minority oversampling technique (SMOTE), combined with a random forest algorithm and a 1:1 sampling strategy, provides the best predictive power. Our final model correctly identifies 84% of all non-compliant outlets in a test set of 92,595 (sensitivity = 0.843, specificity = 0.745, precision = 0.274). The originality of this work lies in its methodological approach, which combines machine learning with fine-grained neighborhood data to make robust predictions of compliance.
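
The three reported metrics measure different things, which is why high sensitivity can sit alongside low precision when non-compliant outlets are a small minority. The following is a minimal sketch of the standard definitions, not the authors' code. The class and function names and the example counts are hypothetical and chosen only for illustration; "positive" is assumed to mean a non-compliant outlet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; 'positive' = non-compliant (assumption)."""
    tp: int  # non-compliant outlets flagged as non-compliant
    fp: int  # compliant outlets flagged as non-compliant
    tn: int  # compliant outlets predicted compliant
    fn: int  # non-compliant outlets the model missed


def sensitivity(c: ConfusionCounts) -> float:
    """Share of truly non-compliant outlets that the model flags (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of truly compliant outlets that the model leaves unflagged."""
    return c.tn / (c.tn + c.fp)


def precision(c: ConfusionCounts) -> float:
    """Share of flagged outlets that are truly non-compliant."""
    return c.tp / (c.tp + c.fp)


def precision_from_rates(sens: float, spec: float, prevalence: float) -> float:
    """Precision implied by sensitivity, specificity, and prevalence (Bayes' rule)."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for 1,000 outlets, 100 of them non-compliant.
# These are illustrative numbers, NOT results from the paper.
example = ConfusionCounts(tp=80, fp=200, tn=700, fn=20)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"precision   = {precision(example):.3f}")
```

In the hypothetical example, most non-compliant outlets are caught, yet fewer than a third of flagged outlets are truly non-compliant, because false positives are drawn from a much larger compliant group. As a rough consistency check, calling `precision_from_rates(0.843, 0.745, 0.10)` gives a value close to 0.27; the 0.10 prevalence is back-calculated here for illustration and is not a figure stated in the abstract.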

Keywords: food environments; food hygiene; food safety; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Commerce*
  • Food
  • Food Safety*
  • Humans
  • Machine Learning
  • Residence Characteristics