Block-diagonal discriminant analysis and its bias-corrected rules

Stat Appl Genet Mol Biol. 2013 Jun;12(3):347-59. doi: 10.1515/sagmb-2012-0017.


High-throughput expression profiling allows the expression of tens of thousands of genes to be measured simultaneously. These data have motivated the development of reliable biomarkers for disease subtype identification and diagnosis. Many methods have been developed for analyzing these data, such as diagonal discriminant analysis, support vector machines, and k-nearest neighbor methods. Diagonal discriminant methods have been shown to perform well for high-dimensional data with small sample sizes. Despite the popularity of these methods, their assumption of independence among genes is unlikely to hold in practice. Recently, a gene-module-based linear discriminant analysis strategy that exploits the correlation among genes has been proposed. However, that approach can be underpowered when the sample sizes of the two classes are unbalanced. In this paper, we propose to correct the biases in the discriminant scores of block-diagonal discriminant analysis. In simulation studies, our proposed method outperforms other approaches in various settings. We also illustrate the proposed method by applying it to data from microarray studies.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Case-Control Studies
  • Computer Simulation
  • Data Interpretation, Statistical
  • Diagnosis, Differential
  • Discriminant Analysis
  • Gene Expression Profiling / methods*
  • Gene Regulatory Networks
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Lymphoma / diagnosis
  • Lymphoma / genetics
  • Lymphoma / metabolism
  • Models, Biological
  • Models, Statistical
  • Molecular Diagnostic Techniques / methods
  • Shock, Septic / genetics
  • Shock, Septic / metabolism