Mild cognitive impairment (MCI) is a prodromal stage of Alzheimer's disease (AD) that causes a significant burden in caregiving and medical costs. Clinically, the diagnosis of MCI is determined by the impairment statuses of five cognitive domains. If one of these cognitive domains is impaired, the patient is diagnosed with MCI, and if two out of the five domains are impaired, the patient is diagnosed with AD. In medical records, most of the time, the diagnosis of MCI/AD is given, but not the statuses of the five domains. We may treat the domain statuses as missing variables. This diagnostic procedure relates MCI/AD status modeling to multiple-instance learning, where each domain resembles an instance. However, traditional multiple-instance learning assumes common predictors among instances, but in our case, each domain is associated with different predictors. In this article, we generalized the multiple-instance logistic regression to accommodate the heterogeneity in predictors among different instances. The proposed model is dubbed heterogeneous-instance logistic regression and is estimated via the expectation-maximization algorithm because of the presence of the missing variables. We also derived two variants of the proposed model for the MCI and AD diagnoses. The proposed model is validated in terms of its estimation accuracy, latent status prediction, and robustness via extensive simulation studies. Finally, we analyzed the National Alzheimer's Coordinating Center-Uniform Data Set using the proposed model and demonstrated its potential.
Keywords: Alzheimer's disease; EM algorithm; logistic regression; mild cognitive impairment; multiple instance learning.
© 2024 John Wiley & Sons Ltd.