Predicting COVID-19 Outbreaks in Correctional Facilities Using Machine Learning

MDM Policy Pract. 2024 Jan 29;9(1):23814683231222469. doi: 10.1177/23814683231222469. eCollection 2024 Jan-Jun.

Abstract

Introduction. The risk of infectious disease transmission, including COVID-19, is disproportionately high in correctional facilities due to close living conditions, relatively low levels of vaccination, and reduced access to testing and treatment. While much progress has been made on describing and mitigating COVID-19 and other infectious disease risk in jails and prisons, there are open questions about which data can best predict future outbreaks. Methods. We used facility data and demographic and health data collected from 24 prison facilities in the Pennsylvania Department of Corrections from March 2020 to May 2021 to determine which sources of data best predict a coming COVID-19 outbreak in a prison facility. We used machine learning methods to cluster the prisons into groups based on similar facility-level characteristics, including size, rurality, and demographics of incarcerated people. We developed logistic regression classification models to predict for each cluster, before and after vaccine availability, whether there would be no cases, an outbreak defined as 2 or more cases, or a large outbreak, defined as 10 or more cases in the next 1, 2, and 3 d. We compared these predictions to data on outbreaks that occurred. Results. Facilities were divided into 8 clusters of sizes varying from 1 to 7 facilities per cluster. We trained 60 logistic regressions; 20 had test sets with between 35% and 65% of days with outbreaks detected. Of these, 8 logistic regressions correctly predicted the occurrence of an outbreak more than 55% of the time. The most common predictive feature was incident cases among the incarcerated population from 2 to 32 d prior. Other predictive features included the number of tests administered from 1 to 33 d prior, total population, test positivity rate, and county deaths, hospitalizations, and incident cases. Cumulative cases, vaccination rates, and race, ethnicity, or age statistics for incarcerated populations were generally not predictive. Conclusions. County-level measures of COVID-19, facility population, and test positivity rate appear as potential promising predictors of COVID-19 outbreaks in correctional facilities, suggesting that correctional facilities should monitor community transmission in addition to facility transmission to inform future outbreak response decisions. These efforts should not be limited to COVID-19 but should include any large-scale infectious disease outbreak that may involve institution-community transmission.

Highlights: The risk of infectious disease transmission, including COVID-19, is disproportionately high in correctional facilities.We used machine learning methods with data collected from 24 prison facilities in the Pennsylvania Department of Corrections to determine which sources of data best predict a coming COVID-19 outbreak in a prison facility.Key predictors included county-level measures of COVID-19, facility population, and the test positivity rate in a facility.Fortifying correctional facilities with the ability to monitor local community rates of infection (e.g., though improved interagency collaboration and data sharing) along with continued testing of incarcerated people and staff can help correctional facilities better predict-and respond to-future infectious disease outbreaks.

Keywords: COVID-19; corrections health; infectious disease prediction; machine learning.