Using Artificial Intelligence to Learn Optimal Regimen Plan for Alzheimer's Disease

medRxiv. 2023 Jan 29;2023.01.26.23285064. doi: 10.1101/2023.01.26.23285064. Preprint


Background: Alzheimer's Disease (AD) is a progressive neurological disorder with no curative medication. Only a few medications are approved by the FDA (i.e., donepezil, galantamine, rivastigmine, and memantine) to relieve symptoms such as cognitive decline, and considerable clinical skill is needed to choose an appropriate regimen given the multiple comorbidities common in this patient population.

Objective: We propose using reinforcement learning (RL) to learn clinicians' treatment decisions for AD patients from the longitudinal records in Electronic Health Records (EHR).

Methods: We selected 1,736 patients meeting our criteria from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. We focused on the two most frequent comorbidities, depression and hypertension, resulting in five main cohorts: 1) whole data, 2) AD-only, 3) AD-hypertension, 4) AD-depression, and 5) AD-hypertension-depression. We framed treatment learning as an RL problem by defining its three components (states, actions, and reward) under multiple strategies: a regression model and a decision tree were developed to generate states; six main medication categories (i.e., no drugs, cholinesterase inhibitors, memantine, hypertension drugs, a combination of cholinesterase inhibitors and memantine, and supplements or other drugs) served as actions; and Mini-Mental State Exam (MMSE) scores served as the reward.
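The states/actions/reward formulation above can be illustrated with a minimal tabular Q-learning sketch over logged visit transitions. This is not the authors' implementation: the integer state encoding, the toy transitions, the hyperparameters, and the reading of the reward as a change in MMSE between visits are all assumptions made for illustration. Only the six action labels come from the abstract.

```python
import random

# Action labels taken from the abstract; the integer encoding is an assumption.
ACTIONS = [
    "no drugs",
    "cholinesterase inhibitors",
    "memantine",
    "hypertension drugs",
    "cholinesterase inhibitors + memantine",
    "supplements or other drugs",
]

def q_learning(transitions, n_states, n_actions,
               alpha=0.1, gamma=0.9, epochs=500, seed=0):
    """Tabular Q-learning replayed over logged (s, a, r, s_next) tuples."""
    rng = random.Random(seed)
    data = list(transitions)
    # Record which actions were actually observed in each state.
    seen = [set() for _ in range(n_states)]
    for s, a, _, _ in data:
        seen[s].add(a)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(epochs):
        rng.shuffle(data)
        for s, a, r, s_next in data:
            # Bootstrap only from actions observed in s_next, so that
            # never-tried actions (still at 0) do not look artificially good.
            future = max((q[s_next][b] for b in seen[s_next]), default=0.0)
            q[s][a] += alpha * (r + gamma * future - q[s][a])
    return q, seen

def greedy_policy(q, seen):
    """Best observed action per state; None where the state has no data."""
    return [max(acts, key=lambda b: q[s][b]) if acts else None
            for s, acts in enumerate(seen)]

# Made-up transitions: (state, action index, reward, next state).
toy = [
    (0, 0, -3.0, 1),
    (0, 1, -1.0, 0),
    (1, 2, -2.0, 1),
    (1, 1, -1.0, 0),
]
q_table, seen_actions = q_learning(toy, n_states=2, n_actions=len(ACTIONS))
policy = greedy_policy(q_table, seen_actions)
print([ACTIONS[a] for a in policy])
```

Restricting the maximum to observed actions is a design choice for logged data: with negative rewards, an action that never appears in the records would otherwise keep its initial value of zero and be selected by default.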

Results: Given sufficient data, the RL model can generate an optimal policy (regimen plan) that outperforms the clinicians' treatment regimen. With the smallest data sample, the optimal policies (i.e., from policy iteration and Q-learning) earned less reward than the clinicians' policy (mean -2.68 and -2.76 vs. -2.66, respectively), but they earned more reward once the data size increased (mean -3.56 and -2.48 vs. -3.57, respectively).
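One way to picture the comparison between a policy-iteration policy and a clinician policy is to evaluate both on the same estimated tabular model. The sketch below is illustrative only: the transition matrix P, the reward table R, the discount factor, and the fixed "clinician" policy are hypothetical toy values, and the abstract does not state how the authors computed their mean rewards.

```python
def q_value(s, a, v, P, R, gamma):
    """One-step lookahead: R[s][a] + gamma * sum_t P[s][a][t] * v[t]."""
    return R[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))

def evaluate(policy, P, R, gamma=0.9, tol=1e-8):
    """Iterative policy evaluation for a deterministic policy."""
    v = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            new = q_value(s, policy[s], v, P, R, gamma)
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < tol:
            return v

def policy_iteration(P, R, gamma=0.9):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    n_states, n_actions = len(P), len(P[0])
    policy = [0] * n_states
    while True:
        v = evaluate(policy, P, R, gamma)
        improved = [
            max(range(n_actions), key=lambda a: q_value(s, a, v, P, R, gamma))
            for s in range(n_states)
        ]
        if improved == policy:
            return policy, v
        policy = improved

# Hypothetical 2-state, 2-action model: P[s][a] is a distribution over next states.
P = [
    [[0.5, 0.5], [0.9, 0.1]],
    [[0.2, 0.8], [0.6, 0.4]],
]
R = [
    [-1.0, -0.5],
    [-3.0, -2.0],
]

clinician_policy = [0, 0]  # made-up stand-in for the observed behaviour
v_clinician = evaluate(clinician_policy, P, R)
optimal_policy, v_optimal = policy_iteration(P, R)
print(optimal_policy, v_optimal, v_clinician)
```

On a fixed model, the policy returned by policy iteration is never worse than any other policy in any state, so a learned policy that scores below the clinicians' on a small sample points to error in the estimated model or in the evaluation data, not to the algorithm itself.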

Conclusions: Our results highlight the potential of using RL to generate optimal treatment regimens from patients' longitudinal records. Our work can pave the way toward RL-based decision support systems that could support the day-to-day management of Alzheimer's disease with comorbidities.
