Prediction of impending central-line-associated bloodstream infections in hospitalized cardiac patients: development and testing of a machine-learning model

J Hosp Infect. 2022 Jun 20;127:44-50. doi: 10.1016/j.jhin.2022.06.003. Online ahead of print.


Background: While modelling of central-line-associated bloodstream infection (CLABSI) risk factors is common, models that predict an impending CLABSI in real time are lacking.

Aim: To build a prediction model which identifies patients who will develop a CLABSI in the ensuing 24 h.

Methods: We collected variables potentially related to infection identification in all patients admitted to the cardiac intensive care unit or cardiac ward at Boston Children's Hospital in whom a central venous catheter (CVC) was in place between January 2010 and August 2020, excluding those with a diagnosis of bacterial endocarditis. We created models predicting whether a patient would develop CLABSI in the ensuing 24 h. We assessed model performance by area under the curve (AUC), sensitivity and false-positive rate (FPR), with models run on an independent testing set (40% of the data).
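
For readers less familiar with the three performance measures named in the Methods, the following is a minimal sketch of how sensitivity, FPR and AUC can be computed from binary labels and model scores. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 alert threshold are all hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than whatever routine the study used.

```python
from typing import Sequence, Tuple


def sensitivity_and_fpr(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, false-positive rate) when scores >= threshold are flagged."""
    tp = fn = fp = tn = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if y == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), fp / (fp + tn)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative
    (ties count as half). Quadratic in sample size; fine for a small illustration."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = CLABSI within the next 24 h, 0 = no CLABSI.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 0]
toy_scores = [0.10, 0.20, 0.15, 0.30, 0.80, 0.70, 0.40, 0.05]

toy_sens, toy_fpr = sensitivity_and_fpr(toy_labels, toy_scores, threshold=0.5)
toy_auc = auc(toy_labels, toy_scores)
print(f"sensitivity={toy_sens:.2f}  FPR={toy_fpr:.3f}  AUC={toy_auc:.3f}")
```

Sensitivity and FPR depend on the chosen alert threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can report a moderate AUC alongside a very low FPR at a conservative operating point.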

Findings: A total of 104,035 patient-days and 139,662 line-days corresponding to 7468 unique patients were included in the analysis. There were 399 positive blood cultures (0.38% of patient-days), most commonly with Staphylococcus aureus (23% of infections). Major predictors included a prior history of infection, elevated maximum heart rate, elevated maximum temperature, elevated C-reactive protein, exposure to parenteral nutrition and use of alteplase for CVC clearance. The model identified 25% of positive cultures with an FPR of 0.11% (AUC = 0.82).
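
To make the reported rates concrete, the sketch below converts them into expected alert counts per 100,000 patient-days. It is illustrative arithmetic, not a result reported by the authors, and it rests on several assumptions: that each positive culture corresponds to one positive patient-day, that the event rate in the deployment population matches the overall 399/104,035, and that the FPR is expressed per negative patient-day. The function name `expected_alerts` is hypothetical.

```python
# Figures taken from the Findings.
patient_days = 104_035
positive_cultures = 399
sensitivity = 0.25   # share of positive cultures the model identified
fpr = 0.0011         # false-positive rate of 0.11%

# Assumption: one positive culture = one positive patient-day.
prevalence = positive_cultures / patient_days


def expected_alerts(n_days: float, prevalence: float, sensitivity: float, fpr: float):
    """Expected (true alerts, false alerts) over n_days at the given rates."""
    positives = n_days * prevalence
    negatives = n_days - positives
    return sensitivity * positives, fpr * negatives


true_alerts, false_alerts = expected_alerts(100_000, prevalence, sensitivity, fpr)

# Share of alerts that would be true positives under the stated assumptions.
ppv = true_alerts / (true_alerts + false_alerts)

print(f"prevalence: {prevalence:.2%} of patient-days")
print(f"per 100,000 patient-days: {true_alerts:.0f} true alerts, {false_alerts:.0f} false alerts")
print(f"share of alerts that are true positives: {ppv:.0%}")
```

Because CLABSI is rare, even a very low FPR is applied to a large number of negative patient-days, so the number of false alerts can be of the same order as the number of true alerts.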

Conclusions: A machine-learning model can identify 25% of patients with impending CLABSI at a false-positive rate of only 1.1 per 1000 (0.11%). Once prospectively validated, this tool may allow for early treatment or prevention.

Keywords: Cardiac surgical procedures; Central line-associated bloodstream infection; Congenital; Machine learning; Predictive analytics; Random forest classification.