Learning a weighted meta-sample based parameter free sparse representation classification for microarray data

PLoS One. 2014 Aug 12;9(8):e104314. doi: 10.1371/journal.pone.0104314. eCollection 2014.

Abstract

Sparse representation classification (SRC) is one of the most promising classification methods for supervised learning. This method can effectively exploit discriminating information by introducing an ℓ1 regularization term to the data. With the desirable property of sparsity, SRC is robust to both noise and outliers. In this study, we propose a weighted meta-sample based non-parametric sparse representation classification method for the accurate identification of tumor subtype. The proposed method includes three steps. First, we extract the weighted meta-samples for each subclass from raw data, and the rationality of the weighting strategy is proven mathematically. Second, sparse representation coefficients are obtained by ℓ1 regularization of underdetermined linear equations, so that data-dependent sparsity can be adaptively tuned. Third, a simple characteristic function is used to achieve classification. Asymptotic time complexity analysis is applied to our method. Compared with some state-of-the-art classifiers, the proposed method has lower time complexity and more flexibility. Experiments on eight samples of publicly available gene expression profile data show the effectiveness of the proposed method.
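The three steps can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: meta-samples are taken as the leading left singular vectors of each class matrix, the weights are each singular value's share of the retained total, and the ℓ1 step is approximated with ISTA using a fixed penalty `lam` (the paper's parameter-free tuning is not reproduced here). The function names `weighted_meta_samples`, `l1_coefficients`, and `classify` are hypothetical.

```python
import numpy as np


def weighted_meta_samples(X, k):
    """Return k weighted meta-samples (columns) for one class.

    X has shape (genes, samples). Assumption for this sketch: meta-samples
    are the top-k left singular vectors, each scaled by its share of the
    retained singular-value mass. The paper's weighting may differ.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    weights = s[:k] / s[:k].sum()
    return U[:, :k] * weights


def l1_coefficients(A, y, lam=0.01, n_iter=500):
    """Approximately minimize 0.5*||A x - y||^2 + lam*||x||_1 with ISTA.

    lam is an assumption of this sketch, not a parameter from the paper.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # inverse Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))  # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # shrinkage
    return x


def classify(y, dictionaries):
    """Assign y to the class whose meta-samples reconstruct it best.

    dictionaries maps a class label to its (genes, k) meta-sample matrix.
    The characteristic function keeps only the coefficients that belong to
    one class and zeroes the rest, as in standard SRC.
    """
    labels = list(dictionaries)
    A = np.hstack([dictionaries[c] for c in labels])
    x = l1_coefficients(A, y)
    residuals, start = [], 0
    for c in labels:
        width = dictionaries[c].shape[1]
        x_c = np.zeros_like(x)
        x_c[start:start + width] = x[start:start + width]
        residuals.append(np.linalg.norm(y - A @ x_c))
        start += width
    return labels[int(np.argmin(residuals))]
```

A caller would build one dictionary per tumor subtype from the training expression matrix with `weighted_meta_samples`, then pass each test profile to `classify`. The minimum-residual rule is the usual SRC decision rule; the abstract does not spell out its characteristic function, so this reading is an assumption.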

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Data Mining
  • Datasets as Topic
  • Gene Expression Profiling*
  • Humans
  • Models, Statistical
  • Neoplasms / diagnosis
  • Neoplasms / genetics
  • Oligonucleotide Array Sequence Analysis*
  • Reproducibility of Results

Grants and funding

This work is supported by the Program for New Century Excellent Talents in University (Grant NCET-10-0365), the National Nature Science Foundation of China (Grants 60973082, 11171369, 61272395, 61370171, 61300128), the National Nature Science Foundation of Hunan Province (Grant 12JJ2041), the Planned Science and Technology Project of Hunan Province (Grants 2009FJ3195, 2012FJ2012), the Fundamental Research Funds for the Central Universities, Hunan University, and the Graduate Research and Innovation Projects in Hunan Province (Grant CX2012B144). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.