Speech based natural language profile before, during and after the onset of psychosis: A cluster analysis

Acta Psychiatr Scand. 2024 Apr 10. doi: 10.1111/acps.13685. Online ahead of print.


Background and hypothesis: Speech markers are a digitally acquired, computationally derived, quantifiable set of measures that reflect the state of neurocognitive processes relevant for social functioning. "Oddities" in language and communication have historically been seen as a core feature of schizophrenia. The application of natural language processing (NLP) to speech samples can elucidate even the most subtle deviations in language. We aim to determine whether NLP-based profiles that are distinctive of schizophrenia can be observed across the various clinical phases of psychosis.

Design: Our sample consisted of 147 participants: 39 healthy controls (HC), 72 with first-episode psychosis (FEP), 18 in a clinical high-risk state (CHR), and 18 with schizophrenia (SZ). A structured task elicited 3 minutes of speech, which was then transformed into quantitative measures on 12 linguistic variables (lexical, syntactic, and semantic). Cluster analysis that leveraged healthy variations was then applied to determine language-based subgroups.
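
The design above can be illustrated with a minimal sketch. The abstract does not state the clustering algorithm or how healthy variation was used, so everything here is an assumption: each of the 12 variables is scaled by the mean and standard deviation of the healthy controls, and a plain k-means with k = 3 is applied. The function names, the random placeholder data, and the choice of k-means are hypothetical and are not the authors' method.

```python
import numpy as np

def zscore_against_controls(features, is_control):
    """Scale each variable by the mean and SD of the healthy-control rows.

    Assumption: this is one plausible reading of "leveraged healthy
    variations"; the source does not specify the procedure.
    """
    hc = features[is_control]
    mu = hc.mean(axis=0)
    sd = hc.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # guard against constant variables
    return (features - mu) / sd

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (a stand-in, not the published algorithm)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct observations chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every participant to every center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Placeholder data only: 147 participants x 12 linguistic variables,
# with the first 39 rows standing in for healthy controls.
rng = np.random.default_rng(42)
features = rng.normal(size=(147, 12))
is_control = np.arange(147) < 39

z = zscore_against_controls(features, is_control)
labels, centers = kmeans(z, k=3)
```

Scaling against controls expresses every participant's score as a deviation from typical speech, so clusters that sit far from zero can be read as atypical profiles. The random data carry no clinical meaning and will not reproduce the reported clusters.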

Results: We observed a three-cluster solution. The largest cluster included most HC and the majority of patients, indicating a 'typical linguistic profile' (TLP). One of the atypical clusters had notably high semantic similarity in word choices, with fewer perceptual words and lower cohesion and analytical structure; this cluster was almost entirely composed of patients in early stages of psychosis (early phase profile, EPP). The second atypical cluster had more patients with established schizophrenia (stable phase profile, SPP), with more perceptual but fewer cognitive/emotional word classes, simpler syntactic structure, and a lack of sufficient reference to prior information (reduced givenness).

Conclusion: The patterns of speech deviations in early and established stages of schizophrenia are distinguishable from each other and detectable when lexical, semantic and syntactic aspects are assessed in the pursuit of 'formal thought disorder'.

Keywords: communication; computational; disorganization; early‐intervention; impoverishment; linguistics; thought.