Machine learning for predicting cognitive deficits using auditory and demographic factors

PLoS One. 2024 May 14;19(5):e0302902. doi: 10.1371/journal.pone.0302902. eCollection 2024.


Importance: Predicting neurocognitive deficits from complex auditory assessments could change how cognitive dysfunction is identified and monitored over time. Detecting cognitive impairment in people living with HIV (PLWH) is important for early intervention, especially in low- to middle-income countries, where most cases occur. Auditory test results are related to neurocognitive test results, but their incremental predictive capability beyond demographic factors is unknown.

Objective: To use machine learning to predict neurocognitive deficits from auditory tests and demographic factors.

Setting: The Infectious Disease Center in Dar es Salaam, Tanzania.

Participants: The initial cohort comprised 939 Tanzanian individuals from Dar es Salaam, living with and without HIV, who were part of a longitudinal study. Individuals were excluded if they had only one visit or a positive history of ear drainage, concussion, significant noise or chemical exposure, neurological disease, mental illness, exposure to ototoxic antibiotics (e.g., gentamicin), or chemotherapy. This left 478 participants (349 PLWH, 129 HIV-negative). Participant data were randomly assigned to training and test sets for machine learning.

Main outcome(s) and measure(s): The main outcome was whether auditory variables combined with relevant demographic variables could predict neurocognitive dysfunction (defined as a score of <26 on the Kiswahili Montreal Cognitive Assessment) better than demographic factors alone. The performance of the predictive machine learning algorithms was evaluated primarily using the area under the receiver operating characteristic curve. Secondary evaluation metrics included F1 scores, accuracies, and Youden's indices for the algorithms.
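The evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only. The area under the curve is computed here through its rank interpretation (the probability that a randomly chosen impaired case receives a higher score than a randomly chosen unimpaired case), which may differ from the procedure used in the study.

```python
def moca_label(score, cutoff=26):
    """Return 1 (deficit) if the MoCA score is below the cutoff, else 0."""
    return 1 if score < cutoff else 0


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 1 = deficit, 0 = none."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def youden_index(tp, tn, fp, fn):
    """Sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn))      # 0.75
print(f1_score(tp, fp, fn))          # 0.75
print(youden_index(tp, tn, fp, fn))  # 0.5
print(roc_auc(y_true, scores))       # 0.9375
```

Accuracy, F1, and Youden's index all depend on the chosen decision threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is one reason it is a natural primary measure.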

Results: The percentage of individuals with cognitive deficits was 36.2% (139 PLWH and 34 HIV-negative). The Gaussian and kernel naïve Bayes classifiers were the most predictive algorithms for neurocognitive impairment. When trained with auditory variables, the Gaussian and kernel classifiers had average area under the curve values of 0.91 and 0.87, F1 scores (the harmonic mean of precision and recall) of 0.81 and 0.76, and average accuracies of 86.3% and 81.9%, respectively. Algorithms trained without auditory variables as features performed significantly worse (p < .001) in both the primary measure of area under the curve (0.82/0.78) and the secondary measure of accuracy (72.3%/74.5%) for the Gaussian and kernel algorithms, respectively.
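For readers unfamiliar with the Gaussian naïve Bayes classifier, the following is a minimal sketch of the general technique, not the authors' implementation. The class name, the two features (age and a single auditory score), and all data values are hypothetical and chosen only to make the example run. The kernel variant, which replaces the per-feature Gaussian with a kernel density estimate, is not shown.

```python
import math


class GaussianNaiveBayes:
    """Minimal Gaussian naive Bayes: one independent normal per feature and class."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.priors_, self.means_, self.vars_ = {}, {}, {}
        n_features = len(X[0])
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.priors_[c] = len(rows) / len(X)
            means = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
            # Small constant keeps the variance strictly positive.
            variances = [
                sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + 1e-9
                for j in range(n_features)
            ]
            self.means_[c], self.vars_[c] = means, variances
        return self

    def _log_joint(self, x, c):
        """log P(c) + sum over features of log N(x_j | mean, variance)."""
        total = math.log(self.priors_[c])
        for v, m, var in zip(x, self.means_[c], self.vars_[c]):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total

    def predict(self, x):
        return max(self.classes_, key=lambda c: self._log_joint(x, c))

    def predict_proba(self, x, positive=1):
        """Posterior probability of the positive class (usable as a ROC score)."""
        logs = {c: self._log_joint(x, c) for c in self.classes_}
        top = max(logs.values())
        weights = {c: math.exp(v - top) for c, v in logs.items()}
        return weights[positive] / sum(weights.values())


# Hypothetical toy data: [age in years, auditory score]; 1 = deficit, 0 = none.
X = [[30, 0.90], [35, 0.80], [40, 0.85], [55, 0.40], [60, 0.30], [65, 0.35]]
y = [0, 0, 0, 1, 1, 1]

with_auditory = GaussianNaiveBayes().fit(X, y)
demographic_only = GaussianNaiveBayes().fit([[row[0]] for row in X], y)

print(with_auditory.predict([58, 0.35]))
print(demographic_only.predict([58]))
```

Fitting the same model once with and once without the auditory column mirrors the structure of the comparison reported above; on real data, the two models would then be scored on a held-out test set.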

Conclusions and relevance: Auditory variables improved the prediction of cognitive function. Because auditory tests are easy to administer and often involve naturalistic tasks, they may offer objective measures or predictors of neurocognitive performance suitable for many global settings. Further research and development into using machine learning algorithms for predicting cognitive outcomes should be pursued.

MeSH terms

  • Adult
  • Cognitive Dysfunction* / diagnosis
  • Female
  • HIV Infections / complications
  • HIV Infections / psychology
  • Humans
  • Longitudinal Studies
  • Machine Learning*
  • Male
  • Middle Aged
  • Neuropsychological Tests
  • Tanzania / epidemiology