Machine learning is a powerful tool that has previously been used to distinguish patients with schizophrenia (SZ) from healthy controls (HC) using magnetic resonance images. Each study, however, uses different datasets, classification algorithms, and validation techniques. Here, we perform a critical appraisal of the accuracy of machine learning methodologies used in SZ/HC classification studies by comparing three machine learning algorithms (logistic regression [LR], support vector machines [SVMs], and linear discriminant analysis [LDA]) on three independent datasets (435 subjects total) using two tissue density estimates and cortical thickness (CT). Performance is assessed using 10-fold cross-validation, as well as a held-out validation set. Classification using CT outperformed classification using tissue densities, but there was no clear effect of dataset. LR, SVMs, and LDA each yielded the highest accuracies for a different feature set and validation paradigm, but most accuracies were between 55% and 70%, well below previously reported values. The highest accuracy achieved was 73.5%, using CT data and an SVM. Taken together, these results illustrate some of the obstacles to constructing effective disease classifiers, and suggest that tissue densities and CT may not be sufficiently sensitive for SZ/HC classification given currently available methodologies and sample sizes.
Keywords: Classification; Cortical thickness; Machine learning; Schizophrenia; Structural magnetic resonance imaging; Voxel-based morphometry.
Copyright © 2019 Elsevier B.V. All rights reserved.
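
The validation paradigm described in the abstract (10-fold cross-validation plus a held-out validation set) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes synthetic Gaussian features in place of tissue density or CT measures, and a hand-written, unregularized logistic regression stands in for the three classifiers compared in the study. All function names, the 20% held-out fraction, the learning rate, and the epoch count are hypothetical choices made only for illustration.

```python
import math
import random


def make_synthetic(n=200, n_features=5, shift=0.8, seed=0):
    """Hypothetical stand-in data: class 1 has its feature means shifted."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate labels so the two classes are balanced
        X.append([rng.gauss(shift * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient descent for logistic regression (no regularization)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, X, y):
    """Fraction of samples whose predicted class matches the label."""
    w, b = model
    correct = 0
    for xi, yi in zip(X, y):
        z = b + sum(wj * xj for wj, xj in zip(w, xi))
        correct += int((z > 0) == (yi == 1))
    return correct / len(y)


def stratified_folds(y, k, rng):
    """Deal the indices of each class round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def evaluate(X, y, k=10, held_out_fraction=0.2, seed=0):
    """Return per-fold CV accuracies and the accuracy on a held-out set."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    rng.shuffle(indices)
    n_held = int(len(y) * held_out_fraction)
    held, dev = indices[:n_held], indices[n_held:]  # held-out set is never used in CV
    X_dev, y_dev = [X[i] for i in dev], [y[i] for i in dev]

    folds = stratified_folds(y_dev, k, rng)
    cv_scores = []
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        model = fit_logistic([X_dev[i] for i in train], [y_dev[i] for i in train])
        cv_scores.append(
            accuracy(model, [X_dev[i] for i in folds[f]], [y_dev[i] for i in folds[f]])
        )

    # Refit on the full development set, then score once on the held-out data.
    final_model = fit_logistic(X_dev, y_dev)
    held_acc = accuracy(final_model, [X[i] for i in held], [y[i] for i in held])
    return cv_scores, held_acc


if __name__ == "__main__":
    X, y = make_synthetic()
    cv_scores, held_acc = evaluate(X, y)
    print("mean 10-fold CV accuracy:", sum(cv_scores) / len(cv_scores))
    print("held-out accuracy:", held_acc)
```

Keeping the held-out subjects out of every cross-validation fold is the point of the design: the cross-validated estimate and the held-out estimate then come from disjoint data, so a large gap between them is a sign of overfitting to the development set. In this sketch the held-out split is a simple random split and is not stratified by class.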