Automated profiling of spontaneous speech in primary progressive aphasia and behavioral-variant frontotemporal dementia: An approach based on usage-frequency

Cortex. 2020 Dec:133:103-119. doi: 10.1016/j.cortex.2020.08.027. Epub 2020 Sep 22.


Language production provides important markers of neurological health. One feature of impairments of language and cognition, such as those that occur in stroke aphasia or Alzheimer's disease, is an overuse of high-frequency, "familiar" expressions. We used computerized analysis to profile narrative speech samples from speakers with variants of frontotemporal dementia (FTD), including subtypes of primary progressive aphasia (PPA). Analysis was performed on language samples from 29 speakers with semantic variant PPA (svPPA), 25 speakers with logopenic variant PPA (lvPPA), 34 speakers with non-fluent variant PPA (nfvPPA), 14 speakers with behavioral variant FTD (bvFTD), and 20 older normal controls (NCs). We used frequency and collocation strength measures to determine use of familiar words and word combinations. We also computed word counts, content word ratio, and a combination ratio, a measure of the degree to which the individual produces connected language. All dementia subtypes differed significantly from NCs. The most discriminating variables were word count, combination ratio, and content word ratio, each of which distinguished at least one dementia group from NCs. All participants with PPA, but not participants with bvFTD, produced significantly more frequent forms at the level of content words, word combinations, or both. Each dementia group differed from the others on at least one variable, and language production variables correlated with established behavioral measures of disease progression. A machine learning classifier, using narrative speech variables, achieved 90% accuracy when classifying samples as NC or dementia, and 59.4% accuracy when matching samples to their diagnostic group. Automated quantification of spontaneous speech in both language-led and non-language-led dementias is feasible. It allows extraction of syndromic profiles that complement those derived from standardized tests, warranting further evaluation as candidate biomarkers. Inclusion of frequency-based language variables benefits profiling and classification.
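
To make two of the simpler variables named above concrete, the following is a minimal sketch of how a word count and a content word ratio might be computed from a transcript. It is not the authors' pipeline: the abstract does not give operational definitions, so the tokenizer, the tiny `FUNCTION_WORDS` list, the `profile` function, and the definition of the ratio as content words divided by all words are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately small function-word list (an assumption for
# illustration; a real analysis would use a full lexicon or a POS tagger).
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "is", "are", "was", "were",
    "on", "in", "at", "of", "to", "it", "he", "she", "they", "that",
})


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def profile(text):
    """Return the word count and content word ratio for one speech sample.

    Assumed definition: content word ratio = content words / all words,
    where a content word is any token not in FUNCTION_WORDS.
    """
    tokens = tokenize(text)
    word_count = len(tokens)
    content_count = sum(1 for t in tokens if t not in FUNCTION_WORDS)
    ratio = content_count / word_count if word_count else 0.0
    return {"word_count": word_count, "content_word_ratio": ratio}


if __name__ == "__main__":
    # Invented sample sentence, not data from the study.
    print(profile("The boy is on the stool"))
```

The combination ratio and the frequency and collocation strength measures are left out here because they depend on reference corpora and definitions that the abstract does not specify.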

Keywords: Dementia; Frontotemporal dementia; Language profiles; Primary progressive aphasia; Usage-frequency.

MeSH terms

  • Alzheimer Disease*
  • Aphasia, Primary Progressive*
  • Frontotemporal Dementia*
  • Humans
  • Language
  • Speech