Prediction of outcomes after cardiac arrest by a generative artificial intelligence model

Simon A Amacher; Armon Arpagaus; Christian Sahmer; Christoph Becker; Sebastian Gross; Tabita Urben; Kai Tisljar; Raoul Sutter; Stephan Marsch; Sabina Hunziker

doi:10.1016/j.resplu.2024.100587

Resusc Plus. 2024 Feb 22:18:100587. doi: 10.1016/j.resplu.2024.100587. eCollection 2024 Jun.

Authors

Simon A Amacher¹ ² ³, Armon Arpagaus², Christian Sahmer², Christoph Becker² ³, Sebastian Gross², Tabita Urben², Kai Tisljar¹, Raoul Sutter¹ ⁴ ⁵, Stephan Marsch¹ ⁴, Sabina Hunziker² ⁴ ⁶

Affiliations

¹ Intensive Care Medicine, Department of Acute Medical Care, University Hospital Basel, Basel, Switzerland.
² Medical Communication and Psychosomatic Medicine, University Hospital Basel, Basel, Switzerland.
³ Emergency Medicine, Department of Acute Medical Care, University Hospital Basel, Basel, Switzerland.
⁴ Medical Faculty, University of Basel, Basel, Switzerland.
⁵ Division of Neurophysiology, Department of Neurology, University Hospital Basel, Basel, Switzerland.
⁶ Post-Intensive Care Clinic, University Hospital Basel, Basel, Switzerland.

Abstract

Aims: To investigate the prognostic accuracy of a non-medical generative artificial intelligence model (Chat Generative Pre-Trained Transformer 4; ChatGPT-4) as a novel approach to predicting death and poor neurological outcome at hospital discharge, based on real-life data from cardiac arrest patients.

Methods: This prospective cohort study investigated the prognostic performance of ChatGPT-4 in predicting outcomes at hospital discharge of adult cardiac arrest patients admitted to intensive care at a large Swiss tertiary academic medical center (COMMUNICATE/PROPHETIC cohort study). For each patient, we prompted ChatGPT-4 with sixteen prognostic parameters derived from established post-cardiac arrest scores. We compared the prognostic performance of ChatGPT-4 for in-hospital mortality and poor neurological outcome with that of three cardiac arrest scores (Out-of-Hospital Cardiac Arrest [OHCA], Cardiac Arrest Hospital Prognosis [CAHP], and PROgnostication using LOGistic regression model for Unselected adult cardiac arrest patients in the Early stages [PROLOGUE score]) in terms of the area under the curve (AUC), sensitivity, specificity, positive and negative predictive values, and likelihood ratios.
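The threshold-based measures named in the Methods (sensitivity, specificity, predictive values, and likelihood ratios) all follow from a 2x2 table of predicted versus observed outcomes. The following is a minimal sketch of those standard definitions, not the study's analysis code; the function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the cohort.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float          # positive predictive value
    npv: float          # negative predictive value
    lr_positive: float  # positive likelihood ratio
    lr_negative: float  # negative likelihood ratio


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> DiagnosticMetrics:
    """Compute standard accuracy measures from a 2x2 table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives. Hypothetical helper:
    it raises ZeroDivisionError if a denominator is zero
    (for example, when specificity is exactly 1).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return DiagnosticMetrics(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
        lr_positive=sensitivity / (1 - specificity),
        lr_negative=(1 - sensitivity) / specificity,
    )


# Illustrative counts only (not study data).
example = diagnostic_metrics(tp=80, fp=20, fn=20, tn=80)
print(example)
```

Unlike the predictive values, the likelihood ratios do not depend on outcome prevalence, which is one reason they are often reported alongside sensitivity and specificity.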

Results: Mortality at hospital discharge was 43% (n = 309/713), and 54% of patients (n = 387/713) had a poor neurological outcome. ChatGPT-4 showed good discrimination regarding in-hospital mortality with an AUC of 0.85, similar to the OHCA, CAHP, and PROLOGUE scores (AUCs of 0.82, 0.83, and 0.84, respectively). For poor neurological outcome, ChatGPT-4's predictions were similar to those of the post-cardiac arrest scores (AUC 0.83).
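For readers less familiar with the AUC, it can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The sketch below illustrates that pairwise definition on made-up values and re-derives the two reported proportions from the counts given above; the function name `auc_from_scores` and the toy scores are assumptions for illustration, and this is not necessarily how the authors computed their AUCs.

```python
def auc_from_scores(scores, outcomes):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    scores: predicted risks (higher means the event is judged more likely).
    outcomes: 1 if the event occurred, 0 otherwise.
    Ties between an event and a non-event count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data only (not study data): three events, two non-events.
toy_scores = [0.90, 0.80, 0.35, 0.40, 0.10]
toy_outcomes = [1, 1, 1, 0, 0]
print(f"toy AUC = {auc_from_scores(toy_scores, toy_outcomes):.3f}")

# Proportions re-derived from the counts reported in the Results.
mortality_pct = round(100 * 309 / 713)
poor_neuro_pct = round(100 * 387 / 713)
print(mortality_pct, poor_neuro_pct)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; rank-based formulas give the same value more efficiently.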

Conclusions: ChatGPT-4 performed similarly to validated post-cardiac arrest scores in predicting mortality and poor neurological outcome. However, more research on illogical answers is needed before an LLM can potentially be incorporated into multimodal outcome prognostication after cardiac arrest.

Keywords: Artificial intelligence; Cardiac arrest; Cardiopulmonary resuscitation; Mortality prediction; Neurological outcome.