Content validity of AI-generated stuttering assessment and intervention programs based on expert review: A comparative analysis across age groups and language versions

J Fluency Disord. 2026 Mar:87:106186. doi: 10.1016/j.jfludis.2025.106186. Epub 2025 Dec 3.

Abstract

Purpose: This study aimed to evaluate the content validity and inter-rater reliability of stuttering assessment and intervention programs generated by artificial intelligence (GPT-4) in both Turkish and English for preschool, school-age, and adult populations. It also examined whether linguistic or cultural differences affected expert evaluations.

Methods: Twelve AI-generated programs (six in Turkish, six in English) were reviewed by twelve certified speech-language pathologists specializing in fluency disorders. Each item was rated on a 5-point Likert scale. Descriptive statistics, Cronbach's alpha, and intraclass correlation coefficients (ICCs) were calculated to assess internal consistency and inter-rater reliability.
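As an illustration of the reliability statistics named above, the following is a minimal sketch of how Cronbach's alpha and single- and average-measures ICCs can be computed from an items-by-raters matrix. The abstract does not state which ICC model was used, so the two-way random-effects, absolute-agreement form is an assumption here. The function names are hypothetical, and the ratings are the classic worked example from Shrout and Fleiss (1979), not data from this study.

```python
from statistics import mean, variance

# Illustrative ratings (rows = rated targets, columns = raters).
# Textbook example from Shrout & Fleiss (1979); NOT the study's data.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]


def cronbach_alpha(matrix):
    """Cronbach's alpha, treating each rater (column) as one 'item'."""
    k = len(matrix[0])
    column_vars = [variance(col) for col in zip(*matrix)]
    total_var = variance([sum(row) for row in matrix])
    return k / (k - 1) * (1 - sum(column_vars) / total_var)


def icc_two_way_random(matrix):
    """Return (single-measures, average-measures) ICC.

    Assumed model: two-way random effects, absolute agreement,
    i.e. ICC(2,1) and ICC(2,k) in Shrout & Fleiss notation.
    """
    n, k = len(matrix), len(matrix[0])
    grand = mean(v for row in matrix for v in row)
    row_means = [mean(row) for row in matrix]
    col_means = [mean(col) for col in zip(*matrix)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in matrix for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


alpha = cronbach_alpha(ratings)
icc_single, icc_average = icc_two_way_random(ratings)
print(f"alpha={alpha:.2f}, ICC(2,1)={icc_single:.2f}, ICC(2,k)={icc_average:.2f}")
```

In this sketch the single-measures coefficient estimates the reliability of one rater's score, whereas the average-measures coefficient estimates the reliability of the mean across all k raters.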

Results: Most items were rated as appropriate or highly appropriate (M = 4.6–4.9). The overall reliability among raters was poor (ICC = 0.45), while single-rater reliability was higher (ICC = 0.65). Only a small number of items were flagged for revision, typically those involving emotional or contextual components. Experts noted that English versions tended to be more detailed and literature-consistent, whereas certain Turkish terms required clearer cultural adaptation.

Conclusion: GPT-4 can produce clinically relevant and linguistically accurate stuttering materials when paired with expert review. However, human validation remains essential to refine affective and culture-specific elements. These findings support the integration of AI-assisted tools in multilingual clinical content development.

Keywords: Artificial intelligence; Content validity; Stuttering.

Publication types

  • Comparative Study

MeSH terms

  • Adolescent
  • Adult
  • Age Factors
  • Artificial Intelligence*
  • Child
  • Child, Preschool
  • Female
  • Humans
  • Language
  • Male
  • Middle Aged
  • Reproducibility of Results
  • Speech-Language Pathology
  • Stuttering* / diagnosis
  • Stuttering* / therapy
  • Turkey
  • Young Adult