Performance and Consistency of ChatGPT-4 Versus Otolaryngologists: A Clinical Case Series

Jérôme R Lechien; Mattheuw R Naunheim; Antonino Maniaci; Thomas Radulesco; Alberto M Saibene; Carlos M Chiesa-Estomba; Luigi A Vaira

doi:10.1002/ohn.759

Otolaryngol Head Neck Surg. 2024 Apr 9. doi: 10.1002/ohn.759. Online ahead of print.

Authors

Jérôme R Lechien¹ ² ³ ⁴, Mattheuw R Naunheim¹ ⁵, Antonino Maniaci¹ ⁶, Thomas Radulesco¹ ⁷, Alberto M Saibene¹ ⁸, Carlos M Chiesa-Estomba¹ ⁹, Luigi A Vaira¹ ¹⁰ ¹¹

Affiliations

¹ Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France.
² Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium.
³ Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris Saclay University, Paris, France.
⁴ Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium.
⁵ Department of Otolaryngology, Massachusetts Eye and Ear, Harvard Medical School, Boston, Massachusetts, USA.
⁶ Department of medicine and surgery, Faculty of Medicine and Surgery, University of Enna "Kore", Enna, Italy.
⁷ ENT-HNS Department, APHM, CNRS, IUSTI, La Conception University Hospital, Aix Marseille Univ, Marseille, France.
⁸ Otolaryngology Unit, Department of Health Sciences, ASST Santi Paolo E Carlo, Università Degli Studi Di Milano, Milan, Italy.
⁹ Department of Otorhinolaryngology-Head and Neck Surgery, Hospital Universitario Donostia, San Sebastian, Spain.
¹⁰ Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy.
¹¹ Department of Biomedical Sciences, PhD School of Biomedical Sciences, University of Sassari, Sassari, Italy.

PMID: 38591726
DOI: 10.1002/ohn.759

Abstract

Objective: To study the performance of Chatbot Generative Pretrained Transformer-4 (ChatGPT-4) in the management of cases in otolaryngology-head and neck surgery.

Study design: Prospective case series.

Setting: Multicenter University Hospitals.

Methods: The history and the clinical, physical, and additional examination findings of adult outpatients consulting in the otolaryngology departments of CHU Saint-Pierre and Dour Medical Center were presented to ChatGPT-4, which was interrogated for differential diagnoses, management, and treatment(s). According to specialty, the ChatGPT-4 responses were assessed by 2 distinct, blinded board-certified otolaryngologists using the Artificial Intelligence Performance Instrument.

Results: One hundred cases were presented to ChatGPT-4. ChatGPT-4 indicated a mean of 3.34 (95% confidence interval [CI]: 3.09, 3.59) additional examinations per patient versus 2.10 (95% CI: 1.76, 2.34; P = .001) for the practitioners. There was strong consistency (k > 0.600) between otolaryngologists and ChatGPT-4 for the indication of upper aerodigestive tract endoscopy, positron emission tomography and computed tomography, audiometry, tympanometry, and psychophysical evaluations. ChatGPT-4 made the correct primary diagnosis in 38% to 86% of cases, depending on subspecialty. Additional examinations indicated by ChatGPT-4 were pertinent and necessary in 8% to 31% of cases, while the treatment regimen was pertinent in 12% to 44% of cases. The performance of ChatGPT-4 was not influenced by the human-reported level of difficulty of clinical cases.
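For readers unfamiliar with the consistency statistic, the following is a minimal sketch of how agreement between two raters on a yes/no examination indication could be quantified, assuming the reported k denotes Cohen's kappa (the abstract does not name the statistic). The function name `cohen_kappa` and the two label lists are hypothetical illustrations and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data (NOT from the study):
# 1 = examination indicated, 0 = not indicated, one entry per case.
clinician = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
chatbot = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]

kappa = cohen_kappa(clinician, chatbot)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the raters agree on 9 of 10 cases and chance agreement is 0.5, so kappa is about 0.8, which would exceed the 0.600 threshold quoted above.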

Conclusion: ChatGPT-4 may be a promising adjunctive tool in otolaryngology, providing extensive documentation about additional examinations, primary and differential diagnoses, and treatments. ChatGPT-4 is more effective in providing a primary diagnosis and less effective in selecting additional examinations and treatments.

Keywords: ChatGPT‐4; artificial intelligence; head neck surgery; otolaryngology; performance.