Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns

Bioinformatics. 2002 May;18(5):725-34. doi: 10.1093/bioinformatics/18.5.725.

Abstract

Gene groups that are significantly related to a disease can be detected by conducting a series of gene expression experiments. This work aims to discover a special type of gene group with the following property: the expression level of each member gene falls within its own pre-determined interval, and this happens with high frequency in one class of cells, whereas the member genes are never all found within these intervals together in the other class of cells. We call these gene groups emerging patterns, to emphasize the patterns' frequency changes between two classes of cells. We use effective discretization and gene selection methods to obtain the most discriminatory genes. We also use efficient algorithms to derive the patterns from these genes. According to our studies on the ALL/AML dataset and the colon tumor dataset, some patterns, which consist of one or more genes, can reach a high frequency of 90%, or even 100%. In other words, they nearly or fully dominate one class of cells, even though they rarely occur in the other class. The discovered patterns are used to classify new cells with a higher accuracy than other reported methods. Based on these patterns, we also conjecture the possibility of a personalized treatment plan which converts colon tumor cells into normal cells by modulating the expression levels of a few genes.
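The defining property of an emerging pattern, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' algorithm: it only checks a single, hand-written pattern against two classes of samples. The gene names, interval bounds, and expression values are hypothetical, and the half-open interval convention is an assumption made for the example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]   # assumed half-open interval [low, high)
Pattern = Dict[str, Interval]    # gene name -> required expression interval
Sample = Dict[str, float]        # gene name -> measured expression level


def matches(sample: Sample, pattern: Pattern) -> bool:
    """Return True if every gene in the pattern lies inside its interval."""
    return all(low <= sample[gene] < high
               for gene, (low, high) in pattern.items())


def frequency(samples: List[Sample], pattern: Pattern) -> float:
    """Fraction of samples in which the whole pattern occurs."""
    if not samples:
        return 0.0
    return sum(matches(s, pattern) for s in samples) / len(samples)


# Hypothetical toy data: two genes measured in two classes of cells.
normal = [
    {"gene_a": 10.0, "gene_b": 300.0},
    {"gene_a": 20.0, "gene_b": 280.0},
    {"gene_a": 15.0, "gene_b": 350.0},
    {"gene_a": 90.0, "gene_b": 310.0},
]
tumor = [
    {"gene_a": 80.0, "gene_b": 290.0},
    {"gene_a": 12.0, "gene_b": 100.0},  # gene_a fits, gene_b does not
    {"gene_a": 95.0, "gene_b": 120.0},
]

# Hypothetical pattern: gene_a low AND gene_b high.
pattern = {"gene_a": (0.0, 50.0), "gene_b": (200.0, float("inf"))}

freq_normal = frequency(normal, pattern)
freq_tumor = frequency(tumor, pattern)

# The pattern qualifies when it occurs in one class and never in the other.
is_emerging = freq_normal > 0 and freq_tumor == 0

print(f"normal: {freq_normal:.2f}, tumor: {freq_tumor:.2f}, "
      f"emerging: {is_emerging}")
```

In this toy data, individual genes may fall inside their intervals in the tumor class, but no tumor sample satisfies all intervals at once, which is the sense in which the member genes are never found in these intervals together.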

Publication types

  • Comparative Study

MeSH terms

  • Algorithms
  • Colonic Neoplasms / classification*
  • Colonic Neoplasms / genetics*
  • Colonic Neoplasms / therapy
  • Database Management Systems*
  • Databases, Genetic
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / statistics & numerical data
  • Gene Expression Regulation, Neoplastic / genetics*
  • Genetic Therapy / methods*
  • Humans
  • Internet
  • Models, Genetic
  • Models, Statistical
  • Pattern Recognition, Automated*
  • Sensitivity and Specificity