ChatGPT Influence on Medical Decision-Making, Bias, and Equity: A Randomized Study of Clinicians Evaluating Clinical Vignettes

Ethan Goh; Bryan Bunning; Elaine Khoong; Robert Gallo; Arnold Milstein; Damon Centola; Jonathan H Chen

doi:10.1101/2023.11.24.23298844

medRxiv [Preprint]. 2023 Nov 27:2023.11.24.23298844. doi: 10.1101/2023.11.24.23298844.

Authors

Ethan Goh¹ ², Bryan Bunning¹, Elaine Khoong³, Robert Gallo¹ ⁴, Arnold Milstein², Damon Centola⁵, Jonathan H Chen¹ ⁶ ²

Affiliations

¹ Stanford Biomedical Informatics Research, Stanford University, Stanford, CA.
² Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA.
³ UCSF Center for Vulnerable Populations at San Francisco General Hospital, San Francisco, CA.
⁴ Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, CA.
⁵ Communication, Sociology and Engineering, University of Pennsylvania, PA.
⁶ Division of Hospital Medicine, Stanford University, Stanford, CA.

Abstract

In a randomized, pre-post intervention study, we evaluated the influence of a large language model (LLM) generative AI system on the accuracy of physician decision-making and on bias in healthcare. Fifty US-licensed physicians reviewed a video clinical vignette featuring actors representing different demographics (a White male or a Black female) with chest pain. Participants answered clinical questions about triage, risk, and treatment based on these vignettes, then were asked to reconsider their answers after receiving advice generated by ChatGPT+ (GPT4). The primary outcome was the accuracy of clinical decisions, judged against pre-established evidence-based guidelines. Physicians were willing to change their initial clinical impressions given AI assistance, and this led to a significant improvement in clinical decision-making accuracy in a chest pain evaluation scenario without introducing or exacerbating existing race or gender biases. A survey of physician participants indicated that the majority expect LLM tools to play a significant role in clinical decision-making.

Publication types

Preprint
