A validation of machine learning-based risk scores in the prehospital setting

Douglas Spangler; Thomas Hermansson; David Smekal; Hans Blomberg

doi:10.1371/journal.pone.0226518

PLoS One. 2019 Dec 13;14(12):e0226518. doi: 10.1371/journal.pone.0226518. eCollection 2019.

Authors

Douglas Spangler¹, Thomas Hermansson², David Smekal¹,², Hans Blomberg¹,²

Affiliations

¹ Uppsala Center for Prehospital Research, Department of Surgical Sciences-Anesthesia and Intensive Care, Uppsala University, Uppsala, Sweden.
² Uppsala Ambulance Service, Uppsala University Hospital, Uppsala, Sweden.

Abstract

Background: The triage of patients in prehospital care is a difficult task, and improved risk assessment tools are needed both at the dispatch center and in the ambulance to differentiate between low- and high-risk patients. This study validates a machine learning-based approach to generating risk scores based on hospital outcomes using routinely collected prehospital data.

Methods: Dispatch, ambulance, and hospital data were collected in one Swedish region from 2016 to 2017. Dispatch center and ambulance records were used to develop gradient boosting models predicting hospital admission, critical care (defined as admission to an intensive care unit or in-hospital mortality), and two-day mortality. Composite risk scores were generated based on the models and compared to National Early Warning Scores (NEWS) and actual dispatched priorities in a prospectively gathered dataset from 2018.

Results: A total of 38,203 patients were included from 2016 to 2018. Concordance indexes (or areas under the receiver operating characteristic curve) for dispatched priorities ranged from 0.51 to 0.66, while those for NEWS ranged from 0.66 to 0.85. Concordance ranged from 0.70 to 0.79 for risk scores based only on dispatch data, and from 0.79 to 0.89 for risk scores including ambulance data. Dispatch data-based risk scores consistently outperformed dispatched priorities in predicting hospital outcomes, while models including ambulance data also consistently outperformed NEWS. Model performance in the prospective test dataset was similar to that found using cross-validation, and calibration was comparable to that of NEWS.
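
As an illustration of the concordance index reported above, the following is a minimal sketch of how such a value can be computed for a binary outcome. It is not the study's code: the function name concordance_index and the toy data are hypothetical, and the study's own evaluation pipeline is not described in this abstract. For a binary outcome, the index is the fraction of (patient with outcome, patient without outcome) pairs in which the patient with the outcome received the higher score, with ties counted as one half.

```python
from itertools import product


def concordance_index(scores, outcomes):
    """Concordance index (equivalently, ROC AUC) for a binary outcome.

    scores   -- risk scores, higher meaning higher predicted risk
    outcomes -- 1 if the outcome occurred (e.g. hospital admission), else 0
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case with and one without the outcome")

    concordant = 0.0
    for p, n in product(positives, negatives):
        if p > n:
            concordant += 1.0   # correctly ordered pair
        elif p == n:
            concordant += 0.5   # tied scores count as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
risk_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
admitted = [1, 0, 1, 0, 0]
c_index = concordance_index(risk_scores, admitted)
print(round(c_index, 3))
```

A value of 0.5 corresponds to ordering patients no better than chance and 1.0 to perfect ordering, which is the scale on which the ranges above should be read. This pairwise version is quadratic in the number of patients; a rank-based formulation would be preferable for a dataset of the size reported here.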

Conclusions: Machine learning-based risk scores outperformed a widely used rule-based triage algorithm and human prioritization decisions in predicting hospital outcomes. Performance was robust in a prospectively gathered dataset, and scores demonstrated adequate calibration. Future research should explore the robustness of these methods when applied to other settings, establish appropriate outcome measures for use in determining the need for prehospital care, and investigate the clinical impact of interventions based on these methods.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Aged
Ambulances / statistics & numerical data*
Critical Care / standards*
Emergency Service, Hospital / statistics & numerical data*
Female
Hospitalization
Humans
Machine Learning*
Male
Middle Aged
Needs Assessment / statistics & numerical data*
Prospective Studies
ROC Curve
Risk Assessment / methods*
Sweden
Triage / methods*

Grants and funding

HB received funding for this study from the Swedish Innovation Agency (https://www.vinnova.se, grant number 2017-04652). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.