Cognitive Digital Biomarkers from Automated Transcription of Spoken Language

J Prev Alzheimers Dis. 2022;9(4):791-800. doi: 10.14283/jpad.2022.66.


Background: Although patients with Alzheimer's disease and other neurodegenerative disorders that affect cognition may benefit from early detection, development of a reliable diagnostic test has remained elusive. The widespread adoption of digital voice-recording technologies, together with the multiple cognitive processes deployed when constructing spoken responses, might offer an opportunity to predict cognitive status.

Objective: To determine whether cognitive status might be predicted from voice recordings of neuropsychological testing.

Design: Comparison of acoustic and (para)linguistic variables from low-quality automated transcriptions of neuropsychological testing (n = 200) versus variables from high-quality manual transcriptions (n = 127). We trained a logistic regression classifier to predict cognitive status, which was tested against actual diagnoses.
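As an illustration of the classification step named in the Design, the following is a minimal sketch of a logistic regression classifier fitted by batch gradient descent. It is not the authors' pipeline: the two features, the toy values, the learning rate, and the helper names (fit_logistic, predict_proba) are hypothetical and chosen only to show the mechanics.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

# Made-up rows (assumption): [standardized age, standardized speaking time].
X = [[-1.0, -0.8], [-0.5, -1.2], [0.2, -0.3],
     [0.4, 0.9], [1.1, 0.7], [0.9, 1.3]]
y = [0, 0, 0, 1, 1, 1]  # 1 = cognitively impaired (hypothetical labels)

w, b = fit_logistic(X, y)
probs = predict_proba(X, w, b)
```

In a real analysis the predicted probabilities would be compared against held-out clinical diagnoses rather than the training rows used here.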

Setting: Observational cohort study.

Participants: 146 participants in the Framingham Heart Study.

Measurements: Acoustic variables, combined with either paralinguistic variables (e.g., speaking time) from automated transcriptions or linguistic variables (e.g., phrase complexity) from manual transcriptions.

Results: Models based on demographic features alone were not robust (area under the receiver operating characteristic curve [AUROC] 0.60). Addition of clinical and standard acoustic features boosted the AUROC to 0.81. Additional inclusion of transcription-related features yielded an AUROC of 0.90.
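For readers unfamiliar with the metric reported above, the AUROC can be read as the probability that a randomly chosen impaired participant receives a higher predicted score than a randomly chosen unimpaired one, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name auroc and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up labels and predicted scores (assumption, for illustration only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(labels, scores)  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.60, 0.81, and 0.90 should be read.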

Conclusions: The use of voice-based digital biomarkers derived from automated processing methods, combined with standard patient screening, might constitute a scalable way to enable early detection of dementia.

Keywords: AD screening; Dementia; biomarkers; predictive modeling.

Publication types

  • Observational Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Biomarkers
  • Cognition
  • Cognitive Dysfunction* / diagnosis
  • Humans
  • Language
  • Sensitivity and Specificity

