Identifying schizophrenia subgroups using clustering and supervised learning

Schizophr Res. 2019 Dec;214:51-59. doi: 10.1016/j.schres.2019.05.044. Epub 2019 Aug 24.


Schizophrenia has a 1% incidence rate worldwide, and those diagnosed present with positive (e.g., hallucinations, delusions), negative (e.g., apathy, asociality), and cognitive symptoms. However, both symptom burden and associated brain alterations are highly heterogeneous and intimately linked to prognosis. In this study, we present a method to predict individual symptom profiles by first deriving clinical subgroups and then using machine learning methods to perform subject-level classification based on neuroanatomical measures derived from magnetic resonance imaging (MRI). Symptom and MRI data from 167 subjects were used. Subgroups were defined using hierarchical clustering of clinical data, resulting in 3 stable clusters: 1) high symptom burden, 2) predominantly positive symptom burden, and 3) mild symptom burden. Cortical thickness estimates were obtained in 78 regions of interest and were input, along with demographic data, into three machine learning models (logistic regression, support vector machine, and random forest) to predict subgroups. Random forest performance metrics for predicting membership in the high and mild symptom burden groups exceeded those of the baseline comparison of the entire schizophrenia population versus normal controls (AUC: 0.81 and 0.78 vs. 0.75). Additionally, an analysis of the most important features in the random forest classification demonstrated consistencies with previous findings of regional impairments and symptoms of schizophrenia.
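
The two-stage design described in the abstract (derive subgroups by hierarchical clustering, then score a one-vs-rest prediction by AUC) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and not the study's pipeline: the two toy symptom scores, the average-linkage rule with Euclidean distance (the abstract does not state the linkage), the single synthetic "thickness" value per subject, and the use of negated thickness as a score in place of a trained logistic regression, support vector machine, or random forest.

```python
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest mean pairwise distance until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            mean_dist = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or mean_dist < best[0]:
                best = (mean_dist, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    labels = [0] * len(points)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)

# Hypothetical (positive, negative) symptom scores for three toy groups:
# high burden, predominantly positive, and mild. Ten subjects each.
centers = [(8.0, 8.0), (8.0, 2.0), (2.0, 2.0)]
symptoms = [(rng.gauss(cx, 0.4), rng.gauss(cy, 0.4))
            for cx, cy in centers for _ in range(10)]

# Stage 1: derive subgroup labels from clinical data alone.
labels = average_linkage(symptoms, 3)

# Stage 2: treat one subgroup as the positive class and evaluate a score.
high = labels[0]  # cluster containing the first toy high-burden subject
is_high = [lab == high for lab in labels]

# Synthetic imaging feature (assumption): slightly thinner cortex in the
# high-burden group. Negated thickness stands in for a classifier output.
thickness = [rng.gauss(2.3 if h else 2.6, 0.2) for h in is_high]
scores = [-t for t in thickness]

high_vs_rest_auc = auc(scores, is_high)
print("cluster sizes:", sorted(labels.count(k) for k in set(labels)))
print("high-burden vs. rest AUC:", round(high_vs_rest_auc, 2))
```

In practice one would more likely use an established library for both stages and estimate AUC on held-out subjects through cross-validation; the sketch evaluates on the same toy subjects only to keep the example short.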

Keywords: Clustering; Heterogeneity; MRI; Machine learning; Schizophrenia; Single-subject prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Area Under Curve
  • Brain / diagnostic imaging*
  • Cluster Analysis*
  • Female
  • Humans
  • Image Interpretation, Computer-Assisted / methods
  • Magnetic Resonance Imaging / methods*
  • Male
  • Organ Size
  • Pattern Recognition, Automated / methods*
  • Psychiatric Status Rating Scales
  • Schizophrenia / diagnosis*
  • Schizophrenia / pathology
  • Severity of Illness Index
  • Supervised Machine Learning*

Grant support