Importance: Large language model (LLM) artificial intelligence (AI) systems have shown promise in diagnostic reasoning, but their utility in management reasoning with no clear right answers is unknown.
Objective: To determine whether LLM assistance improves physician performance on open-ended management reasoning tasks compared to conventional resources.
Design: Prospective, randomized controlled trial conducted from 30 November 2023 to 21 April 2024.
Setting: Multi-institutional study from Stanford University, Beth Israel Deaconess Medical Center, and the University of Virginia involving physicians from across the United States.
Participants: 92 practicing attending physicians and residents with training in internal medicine, family medicine, or emergency medicine.
Intervention: Five expert-developed clinical case vignettes were presented with multiple open-ended management questions and scoring rubrics created through a Delphi process. Physicians were randomized to use either GPT-4 via ChatGPT Plus in addition to conventional resources (e.g., UpToDate, Google), or conventional resources alone.
Main outcomes and measures: The primary outcome was difference in total score between groups on expert-developed scoring rubrics. Secondary outcomes included domain-specific scores and time spent per case.
Results: Physicians using the LLM scored higher compared to those using conventional resources (mean difference 6.5 %, 95% CI 2.7-10.2, p<0.001). Significant improvements were seen in management decisions (6.1%, 95% CI 2.5-9.7, p=0.001), diagnostic decisions (12.1%, 95% CI 3.1-21.0, p=0.009), and case-specific (6.2%, 95% CI 2.4-9.9, p=0.002) domains. GPT-4 users spent more time per case (mean difference 119.3 seconds, 95% CI 17.4-221.2, p=0.02). There was no significant difference between GPT-4-augmented physicians and GPT-4 alone (-0.9%, 95% CI -9.0 to 7.2, p=0.8).
Conclusions and relevance: LLM assistance improved physician management reasoning compared to conventional resources, with particular gains in contextual and patient-specific decision-making. These findings indicate that LLMs can augment management decision-making in complex cases.
Trial registration: ClinicalTrials.gov Identifier: NCT06208423; https://classic.clinicaltrials.gov/ct2/show/NCT06208423.