Information Quality and Readability: ChatGPT's Responses to the Most Common Questions About Spinal Cord Injury

World Neurosurg. 2024 Jan;181:e1138-e1144. doi: 10.1016/j.wneu.2023.11.062. Epub 2023 Nov 22.

Abstract

Objective: This study aimed to assess the quality, readability, and comprehension of texts generated by ChatGPT in response to commonly asked questions about spinal cord injury (SCI).

Methods: The study used Google Trends to identify the most frequently searched keywords related to SCI. The identified keywords were entered sequentially into ChatGPT, and the resulting responses were assessed for quality using the Ensuring Quality Information for Patients (EQIP) tool. The readability of the texts was analyzed using the Flesch-Kincaid grade level and the Flesch-Kincaid reading ease parameters.
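The abstract does not reproduce the scoring formulas. For reference, the standard formulas behind the two readability metrics (the reading ease measure is commonly attributed to Flesch, though reported here as Flesch-Kincaid reading ease) are:

```latex
% Flesch reading ease: higher scores mean easier text; 0-30 is "very difficult"
\mathrm{FRE} = 206.835
  - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right)
  - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)

% Flesch-Kincaid grade level: approximate U.S. school grade needed to comprehend the text
\mathrm{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right)
  + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right)
  - 15.59
```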

Results: The mean EQIP score of the texts was 43.02 ± 6.37, the mean Flesch-Kincaid reading ease score 26.24 ± 13.81, and the mean Flesch-Kincaid grade level 14.84 ± 1.79. The analysis revealed significant concerns about the quality of the texts generated by ChatGPT, indicating serious problems with readability and comprehension. The low mean EQIP score suggests that the accuracy and reliability of the information provided need improvement. The Flesch-Kincaid grade level indicates high linguistic complexity, requiring approximately 14 to 15 years of formal education for comprehension.
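The paper does not state which software computed these scores. As an illustrative sketch only, readability metrics of this kind can be reproduced with the open-source textstat Python package; the sample text below is hypothetical and not taken from the study:

```python
# Illustrative sketch: computing the readability metrics reported above.
# Assumes the third-party "textstat" package (pip install textstat);
# the sample text is hypothetical, not from the study's ChatGPT responses.
import textstat

response = (
    "Spinal cord injury disrupts the transmission of sensory and motor "
    "signals between the brain and the body, and its consequences depend "
    "on the level and completeness of the lesion."
)

# Flesch reading ease: higher scores indicate easier text (0-30 is "very difficult").
print("Reading ease:", textstat.flesch_reading_ease(response))

# Flesch-Kincaid grade level: approximate years of education needed to understand the text.
print("Grade level:", textstat.flesch_kincaid_grade(response))
```

Note that syllable counting is heuristic and varies across implementations, so reproduced scores may differ slightly from those reported in the study.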

Conclusions: The results of this study show that ChatGPT-generated SCI texts are more complex than is optimal for health communication and that ChatGPT cannot currently substitute for a comprehensive medical consultation. Text quality could be improved by relying on credible sources, establishing a scientific board, and collaborating with expert teams. Addressing these concerns would make the texts more accessible, empowering patients and supporting informed decision-making in SCI.

Keywords: ChatGPT; Comprehension; Quality assessment; Readability; Spinal cord injury.

MeSH terms

  • Comprehension*
  • Educational Status
  • Health Literacy*
  • Humans
  • Internet
  • Reading
  • Reproducibility of Results