Comparative analysis of AI-generated study guides in otolaryngology education

Am J Otolaryngol. 2025 Sep-Oct;46(5):104693. doi: 10.1016/j.amjoto.2025.104693. Epub 2025 Jun 20.

Abstract

Introduction: Resident physicians training in otolaryngology frequently rely on dense traditional textbooks such as "Cummings Otolaryngology Head and Neck Surgery," widely regarded as the gold standard for educational content in the field. Integrating artificial intelligence (AI) into educational methods, however, may enhance traditional approaches to trainee learning. This study evaluates the accuracy, relevance, and clarity of AI-generated study guides for otolaryngology residents and their efficacy in graduate-level education.

Methods: Study guides for four rhinology chapters of "Cummings Otolaryngology Head and Neck Surgery" were generated using ChatGPT-4 by a non-expert in otolaryngology to ensure replicability. Multiple board-certified rhinologists evaluated the study guides with a structured assessment form, rating each guide on accuracy, relevance, and clarity using a 4-point scale. The item-level content validity index (I-CVI) was calculated for each parameter.
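The I-CVI referenced above is conventionally computed as the proportion of expert raters who score an item as relevant (3 or 4 on a 4-point scale). The sketch below illustrates that standard calculation; the rater scores are invented example data, not values from this study.

```python
# Hypothetical illustration of an item-level content validity index
# (I-CVI): the proportion of raters scoring an item 3 or 4 on a
# 4-point scale. Example data only, not results from the study.

def i_cvi(ratings, relevant_threshold=3):
    """Proportion of raters scoring at or above the threshold."""
    relevant = sum(1 for r in ratings if r >= relevant_threshold)
    return relevant / len(ratings)

# Example: five hypothetical raters score one study-guide item
example_ratings = [4, 3, 4, 2, 4]
print(round(i_cvi(example_ratings), 2))  # → 0.8
```

An I-CVI of 0.8 or higher is commonly treated as evidence of acceptable content validity, which matches how the 0.8-1.0 range is interpreted in the results.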

Results: The mean scores for accuracy, relevance, and clarity across all chapters were 3.45 ± 0.19, 3.64 ± 0.17, and 3.36 ± 0.08, respectively. I-CVI scores for accuracy, relevance, and clarity ranged from 0.8 to 1.0, indicating acceptable content validity. Reviewers praised the guides' comprehensive coverage and clear formatting, although they suggested incorporating more detailed explanations and visual aids.

Discussion: The findings demonstrate the potential of large language models (LLMs) to generate high-quality educational content. AI-generated resources can reduce the burden on educators and provide tailored materials for residents. Future research should explore refined AI models and multimodal inputs to enhance educational outcomes. LLMs such as OpenAI's GPT-4 have opened new opportunities for personalized learning experiences for graduate-level trainees.

Level of evidence: Level 5.

Keywords: Artificial intelligence; PS/QI; Resident education; Rhinology.

Publication types

  • Comparative Study

MeSH terms

  • Artificial Intelligence*
  • Curriculum
  • Education, Medical, Graduate* / methods
  • Humans
  • Internship and Residency* / methods
  • Otolaryngology* / education