Modified signal-to-noise: a new simple and practical gene filtering approach based on the concept of projective adaptive resonance theory (PART) filtering method

Bioinformatics. 2006 Jul 1;22(13):1662-4. doi: 10.1093/bioinformatics/btl156. Epub 2006 Apr 21.


Given the recent advances in DNA microarray technologies and their benefits, many gene filtering approaches have been employed for the diagnosis and prognosis of diseases. In our previous study, we developed a new filtering method, the projective adaptive resonance theory (PART) filtering method, which was effective in subclass discrimination. In the PART algorithm, genes with a low variance in expression in one class, rather than in both classes, were selected as important genes for modeling. Based on this concept, in the present study we developed novel, simple filtering methods such as the modified signal-to-noise (S2N'). Discrimination models constructed using these methods showed higher accuracy and higher reproducibility than those built with many conventional filtering methods, including the t-test, S2N, NSC and SAM. The reproducibility of prediction was evaluated based on the correlation between the sets of U-test p-values obtained on randomly divided datasets. For leukemia, lymphoma and breast cancer, the correlation was high; a difference of >0.13 was obtained for the model constructed using <50 genes selected by S2N'. The improvement was greater with smaller numbers of genes, and a similarly higher correlation was observed when the t-test, NSC and SAM were used. These results suggest that modified methods such as S2N' have high potential as new methods for marker gene selection in cancer diagnosis using DNA microarray data.
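The idea of favoring genes with low variance in one class can be illustrated with a small sketch. The abstract does not give the S2N' formula, so the code below is not the authors' method. It compares the conventional signal-to-noise score, (mean_a - mean_b) / (sd_a + sd_b), with a hypothetical variant, here named s2n_min_sd, whose denominator uses only the smaller of the two class standard deviations. The function names, the eps guard and the toy expression values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def s2n(a, b, eps=1e-12):
    """Conventional signal-to-noise score for one gene.

    a, b: expression values of the gene in class A and class B.
    eps is an assumed guard against a zero denominator.
    """
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b) + eps)


def s2n_min_sd(a, b, eps=1e-12):
    """Hypothetical variant (NOT the published S2N' formula).

    Uses only the smaller class standard deviation, so a gene that is
    tightly expressed in one class scores highly even when it is noisy
    in the other class.
    """
    return (mean(a) - mean(b)) / (min(stdev(a), stdev(b)) + eps)


def rank_genes(expr_a, expr_b, score, top_k):
    """Return the top_k gene names ranked by absolute score.

    expr_a, expr_b: dicts mapping gene name to a list of expression values.
    """
    scores = {g: abs(score(expr_a[g], expr_b[g])) for g in expr_a}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: g1 is tightly expressed in class A but noisy in class B;
# g2 is moderately noisy in both classes. Both genes differ by 2.0 in mean.
class_a = {"g1": [1.0, 1.1, 0.9, 1.0], "g2": [1.0, 3.0, -1.0, 1.0]}
class_b = {"g1": [3.0, 9.0, -3.0, 3.0], "g2": [3.0, 5.0, 1.0, 3.0]}

print(rank_genes(class_a, class_b, s2n, 1))         # ['g2']
print(rank_genes(class_a, class_b, s2n_min_sd, 1))  # ['g1']
```

In this toy example the conventional score penalizes g1 for its large spread in class B, while the variant ranks it first because its spread in class A is small. This shows only the direction of the concept described above, not the behavior of the published method.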

Availability: Software is available upon request.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computational Biology / methods*
  • Data Interpretation, Statistical
  • Gene Expression Profiling*
  • Gene Expression Regulation
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Pattern Recognition, Automated
  • Reproducibility of Results
  • Software