Machine Learning Analysis Reveals Biomarkers for the Detection of Neurological Diseases

Front Mol Neurosci. 2022 May 31;15:889728. doi: 10.3389/fnmol.2022.889728. eCollection 2022.


Identifying biomarkers for neurological diseases (NLDs) is critical for accelerating drug discovery for diseases that currently lack effective treatments. In this work, we retrieved genotyping and clinical data from 1,223 UK Biobank participants to identify genetic and clinical biomarkers for NLDs, including Alzheimer's disease (AD), Parkinson's disease (PD), motor neuron disease (MND), and myasthenia gravis (MG). Using a machine learning modeling approach with Monte Carlo randomization, we identified a panel of informative diagnostic biomarkers for predicting AD, PD, MND, and MG, including classical liver disease markers such as alanine aminotransferase, alkaline phosphatase, and bilirubin. A multinomial model trained on accessible clinical markers could correctly predict an NLD diagnosis with an accuracy of 88.3%. We also explored genetic biomarkers. In a genome-wide association study of AD, PD, MND, and MG patients, we identified single nucleotide polymorphisms (SNPs) implicated in several craniofacial disorders such as apnoea and branchiootic syndrome. We found evidence for shared genetic risk loci among NLDs, including SNPs in cancer-related genes and SNPs known to be associated with non-brain cancers such as Wilms tumor, leukemia, and colon cancer. This indicates overlapping genetic characterizations among NLDs, which challenges current clinical definitions of these neurological disorders. Taken together, this work demonstrates the value of data-driven approaches for identifying novel biomarkers where no known or promising biomarkers exist.

Keywords: GWAS—genome-wide association study; UK Biobank; machine learning; neurodegeneration; systems biology.
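
The modeling idea named in the abstract, a multinomial classifier evaluated with Monte Carlo randomization, can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes that "Monte Carlo randomization" means repeated random train/test splitting, which the abstract does not specify. The data are synthetic, the three features are only stand-ins for markers such as alanine aminotransferase, alkaline phosphatase, and bilirubin, and every function name and parameter below is hypothetical.

```python
import math
import random

# Class labels from the abstract; the data generated below is synthetic.
CLASSES = ["AD", "PD", "MND", "MG"]
N_FEATURES = 3  # assumed stand-ins for three standardized clinical markers


def make_synthetic(n_per_class, rng):
    """Draw Gaussian feature vectors around an arbitrary centre per class."""
    centres = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0],
               [0.0, 0.0, 2.0], [-2.0, -2.0, -2.0]]
    data = []
    for label, centre in enumerate(centres):
        for _ in range(n_per_class):
            data.append(([rng.gauss(c, 1.0) for c in centre], label))
    return data


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """weights[k] holds [bias, w_1, ..., w_d] for class k."""
    return softmax([w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
                    for w in weights])


def fit_multinomial(train, n_classes, epochs=100, lr=0.1):
    """Multinomial logistic regression fitted by batch gradient descent."""
    weights = [[0.0] * (N_FEATURES + 1) for _ in range(n_classes)]
    n = len(train)
    for _ in range(epochs):
        grads = [[0.0] * (N_FEATURES + 1) for _ in range(n_classes)]
        for x, y in train:
            probs = predict_proba(weights, x)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)
                grads[k][0] += err
                for j, xj in enumerate(x):
                    grads[k][j + 1] += err * xj
        for k in range(n_classes):
            for j in range(N_FEATURES + 1):
                weights[k][j] -= lr * grads[k][j] / n
    return weights


def accuracy(weights, data):
    """Fraction of samples whose most probable class matches the label."""
    hits = 0
    for x, y in data:
        probs = predict_proba(weights, x)
        if probs.index(max(probs)) == y:
            hits += 1
    return hits / len(data)


def monte_carlo_cv(data, n_classes, n_repeats=10, test_fraction=0.3, seed=0):
    """Repeat random train/test splits; return one test accuracy per split."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(accuracy(fit_multinomial(train, n_classes), test))
    return scores


if __name__ == "__main__":
    data = make_synthetic(40, random.Random(42))
    scores = monte_carlo_cv(data, len(CLASSES))
    mean = sum(scores) / len(scores)
    print(f"mean accuracy over {len(scores)} random splits: {mean:.3f}")
```

Averaging accuracy over many random splits, rather than reporting a single split, gives a sense of how much the estimate depends on which participants happen to land in the test set. The accuracy printed here reflects only the synthetic data and has no bearing on the 88.3% reported in the abstract.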