The development of an automated machine learning pipeline for the detection of Alzheimer's Disease

Sci Rep. 2022 Oct 28;12(1):18137. doi: 10.1038/s41598-022-22979-3.


Although Alzheimer's disease (AD) is the most prevalent form of dementia, there are no treatments capable of slowing disease progression. A lack of reliable disease endpoints and/or biomarkers contributes in part to the absence of effective therapies. Using machine learning (ML) to analyze electroencephalography (EEG) recordings offers a possible way to overcome many of the limitations of current diagnostic modalities. Here we develop a logistic regression model with an accuracy of 81% that addresses many of the shortcomings of previous works. To our knowledge, no other study has solved the following problems simultaneously: (1) a lack of automated, unbiased artifact removal, (2) a dependence on a high level of expertise in data pre-processing and ML for non-automated processes, (3) the need for very large sample sizes and accurate EEG source localization using high-density systems, and (4) a reliance on black-box ML approaches, such as deep neural networks, with unexplainable feature selection. This study presents a proof-of-concept for an automated and scalable technology that could potentially be used to diagnose AD in clinical settings as an adjunct to conventional neuropsychological testing, thus enhancing the efficiency, reproducibility, and practicality of AD diagnosis.
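As a rough illustration of why a logistic regression classifier is easier to inspect than a black-box model, the sketch below fits one on synthetic data and prints its learned weights. It is not the study's pipeline: the data are randomly generated, the feature names (`feature_a`, `feature_b`, `noise`) are hypothetical placeholders rather than EEG features from the paper, and the plain gradient-descent fitting routine is an assumption chosen to keep the example free of dependencies.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    # Batch gradient descent on the mean log-loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, xi):
    # Classify as 1 when the predicted probability is at least 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


def make_synthetic(n, rng):
    # Hypothetical data: two informative features and one pure-noise feature.
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.0 if label else -1.0
        X.append([rng.gauss(shift, 1.0), rng.gauss(-shift, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


feature_names = ["feature_a", "feature_b", "noise"]  # placeholders, not from the paper
rng = random.Random(0)
X_train, y_train = make_synthetic(200, rng)
X_test, y_test = make_synthetic(100, rng)

w, b = fit_logistic(X_train, y_train)
test_acc = accuracy(w, b, X_test, y_test)

print(f"held-out accuracy: {test_acc:.2f}")
for name, weight in zip(feature_names, w):
    # The sign and magnitude of each weight show how that feature moves the prediction.
    print(f"{name}: {weight:+.3f}")
```

Each learned weight maps to a single input feature, so a reader can see which inputs drive a prediction and in which direction. The accuracy printed here reflects only the synthetic data and has no bearing on the 81% reported in the abstract.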

MeSH terms

  • Alzheimer Disease* / diagnosis
  • Artifacts
  • Biomarkers
  • Humans
  • Machine Learning
  • Reproducibility of Results

