Aims: As one of the most fundamental questions in modern science, "what causes schizophrenia (SZ)" remains a profound mystery due to the absence of objective gene markers. The reproducibility of the gene signatures identified by independent studies is found to be extremely low due to the incapability of available feature selection methods and the lack of measurement on validating signatures' robustness. These irreproducible results have significantly limited our understanding of the etiology of SZ.
Methods: In this study, a new feature selection strategy was developed, and a comprehensive analysis was then conducted to ensure a reliable signature discovery. Particularly, the new strategy (a) combined multiple randomized sampling with consensus scoring and (b) assessed gene ranking consistency among different datasets, and a comprehensive analysis among nine independent studies was conducted.
Results: Based on a first-ever evaluation of methods' reproducibility that was cross-validated by nine independent studies, the newly developed strategy was found to be superior to the traditional ones. As a result, 33 genes were consistently identified from multiple datasets by the new strategy as differentially expressed, which might facilitate our understanding of the mechanism underlying the etiology of SZ.
Conclusion: A new strategy capable of enhancing the reproducibility of feature selection in current SZ research was successfully constructed and validated. A group of candidate genes identified in this study should be considered as great potential for revealing the etiology of SZ.
Keywords: reproducibility; schizophrenia; significant analysis of microarray; student's t test; transcriptomics.
© 2019 The Authors. CNS Neuroscience & Therapeutics Published by John Wiley & Sons Ltd.