In this work, we aim to predict the diagnostic status of patients with potential neurodegenerative disease using feature values derived from multi-modality neuroimaging data and biological data, which may be incomplete. Collecting the feature values into a matrix, with each row containing the feature vector of one sample, we propose a framework that predicts the associated multiple target outputs (e.g., diagnosis label and clinical scores) from this feature matrix by performing matrix shrinkage followed by matrix completion. Specifically, we first combine the feature and target output matrices into one large matrix and then partition this large incomplete matrix into smaller submatrices, each consisting of samples with complete feature values (corresponding to a certain combination of modalities) and target outputs. Treating each target output as the outcome of a prediction task, we apply a 2-step multi-task learning algorithm to select the most discriminative features and samples in each submatrix. Features and samples that are not selected in any of the submatrices are discarded, resulting in a shrunk version of the original large matrix. The missing feature values and unknown target outputs of the shrunk matrix are then completed simultaneously. Experimental results on the ADNI dataset indicate that our proposed framework achieves higher classification accuracy at greater speed than conventional imputation-based classification methods, and yields competitive performance compared with state-of-the-art methods.
Keywords: Classification; Data imputation; Matrix completion; Multi-task learning.
Copyright © 2014 Elsevier Inc. All rights reserved.
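
The partitioning and shrinkage steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the modality-to-column layout (`MODALITY_COLUMNS`), the use of `None` to mark missing values, and the function names are all assumptions made for illustration. The sketch covers feature columns only, leaving out the target output columns, and it omits the 2-step multi-task feature and sample selection and the final matrix completion. The selections passed to `shrink_indices` are assumed to come from such a selection step.

```python
from collections import defaultdict

# Hypothetical layout: each modality owns a set of feature columns.
MODALITY_COLUMNS = {"MRI": [0, 1], "PET": [2, 3], "CSF": [4]}


def available_modalities(row, modality_columns):
    """Return the modalities whose feature values are all present in `row`.

    Missing values are represented by None (an assumption of this sketch).
    """
    return frozenset(
        name
        for name, cols in modality_columns.items()
        if all(row[c] is not None for c in cols)
    )


def partition_by_modalities(rows, modality_columns):
    """Map each observed modality combination to the samples complete for it.

    A sample joins every submatrix whose modality combination is a subset of
    the modalities it has, so no submatrix contains missing feature values.
    """
    patterns = [available_modalities(r, modality_columns) for r in rows]
    submatrices = defaultdict(list)
    for combo in set(patterns):
        if not combo:
            continue  # a sample with no complete modality defines no submatrix
        for i, pattern in enumerate(patterns):
            if combo <= pattern:
                submatrices[combo].append(i)
    return dict(submatrices)


def shrink_indices(selections):
    """Keep samples and features selected in at least one submatrix.

    `selections` is an iterable of (sample_indices, feature_indices) pairs,
    one pair per submatrix; anything never selected is discarded.
    """
    kept_samples, kept_features = set(), set()
    for sample_idx, feature_idx in selections:
        kept_samples.update(sample_idx)
        kept_features.update(feature_idx)
    return sorted(kept_samples), sorted(kept_features)
```

In this sketch a sample with all modalities contributes to every submatrix, while a sample with only MRI contributes to the MRI submatrix alone. Whether the original framework assigns samples to submatrices in exactly this way is not stated in the abstract.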