Research Letter: Application of GPT-4 to select next-step antidepressant treatment in major depression

medRxiv [Preprint]. 2023 Apr 18:2023.04.14.23288595. doi: 10.1101/2023.04.14.23288595.

Abstract

Introduction: Large language models perform well on a range of academic tasks including medical examinations. The performance of this class of models in psychopharmacology has not been explored.

Method: ChatGPT Plus, which implements the GPT-4 large language model, was presented with each of 10 previously studied antidepressant prescribing vignettes in randomized order, with responses regenerated 5 times per vignette to evaluate their stability. Results were compared to expert consensus.
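The design described above (shuffled vignette order, repeated generations, comparison with expert picks) can be sketched as follows. This is a minimal illustration, not the authors' code: `run_protocol`, `count_hits`, and the `ask_model` callable are hypothetical names, the drug labels are placeholders, and shuffling once per session is an assumption, since the abstract does not say how the randomization was done.

```python
import random
from typing import Callable, Dict, List, Set


def run_protocol(
    vignettes: Dict[str, str],
    ask_model: Callable[[str], List[str]],  # hypothetical: returns the model's top choices
    n_regenerations: int = 5,
    seed: int = 0,
) -> Dict[str, List[List[str]]]:
    """Present each vignette in shuffled order and collect repeated responses."""
    rng = random.Random(seed)
    order = list(vignettes)
    rng.shuffle(order)  # assumption: one random presentation order per session
    responses: Dict[str, List[List[str]]] = {}
    for vignette_id in order:
        text = vignettes[vignette_id]
        # Regenerate the response several times to gauge stability.
        responses[vignette_id] = [ask_model(text) for _ in range(n_regenerations)]
    return responses


def count_hits(
    responses: Dict[str, List[List[str]]],
    reference: Dict[str, Set[str]],
) -> Dict[str, int]:
    """Per vignette, count runs that include at least one reference medication."""
    return {
        vignette_id: sum(1 for run in runs if set(run) & reference[vignette_id])
        for vignette_id, runs in responses.items()
    }
```

Passing the expert-rated optimal medications as `reference` gives a per-vignette count out of 5; passing the poor or contraindicated medications gives the corresponding count of undesirable inclusions.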

Results: At least one of the optimal medication choices was included among the best choices in 38/50 (76%) responses: 5/5 for 7 vignettes, 3/5 for 1, and 0/5 for 2. At least one of the poor-choice or contraindicated medications was included among the choices considered optimal or good in 24/50 (48%) responses. As rationale for treatment selection, the model cited multiple heuristics, including avoiding prior unsuccessful medications, avoiding adverse effects based on comorbidities, and generalizing within medication class.
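The reported proportions follow from the per-vignette breakdown, with the denominator of 50 being 10 vignettes times 5 regenerations. A short check of that arithmetic, using only the figures stated above (the variable names are illustrative):

```python
# (number of vignettes, runs out of 5 that included an optimal choice)
breakdown = [(7, 5), (1, 3), (2, 0)]
runs_per_vignette = 5

n_vignettes = sum(n for n, _ in breakdown)            # 7 + 1 + 2 = 10
total_runs = n_vignettes * runs_per_vignette          # 10 * 5 = 50
optimal_hits = sum(n * hits for n, hits in breakdown) # 35 + 3 + 0 = 38
optimal_pct = 100 * optimal_hits / total_runs         # 76.0

poor_hits = 24                                        # as reported
poor_pct = 100 * poor_hits / total_runs               # 48.0

print(f"optimal included: {optimal_hits}/{total_runs} ({optimal_pct:.0f}%)")
print(f"poor or contraindicated included: {poor_hits}/{total_runs} ({poor_pct:.0f}%)")
```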

Conclusion: The model appeared to identify and apply a number of heuristics commonly used in psychopharmacologic clinical practice. However, the inclusion of suboptimal recommendations indicates that large language models may pose a substantial risk if routinely applied to guide psychopharmacologic treatment without further monitoring.
