Finding unexpected patterns in microarray data

Plant Physiol. 2003 Dec;133(4):1717-25. doi: 10.1104/pp.103.028753.

Abstract

We describe the performance of a protocol based on the sequential application of unsupervised and supervised methods to analyze microarray samples defined by a combination of factors. Correspondence analysis is used to visualize the emerging patterns of three sets of novel or previously published data: photoreceptor mutants of Arabidopsis grown under different light/dark conditions, Arabidopsis exposed to different types of biotic and abiotic stress, and human acute leukemia. We find, for instance, that light has a dramatic effect on plants despite the absence of the four major photoreceptors, that bacterial-, fungal-, and viral-induced responses converge at later stages of attack, and that sample preparation procedures used in different hospitals have large effects on transcriptome patterns. We use canonical discriminant analysis to identify the genes associated with these patterns and hierarchical clustering to find groups of coregulated genes that are easily visualized in a second round of correspondence analysis and ordered tables. The unconventional combination of standard descriptive multivariate methods offers a previously unrecognized tool to uncover unexpected information.
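The first, unsupervised step of the protocol can be illustrated with a minimal sketch of correspondence analysis. The abstract does not describe the authors' implementation, so the code below is only the textbook formulation (singular value decomposition of the standardized residuals of a non-negative genes-by-samples table). The function name `correspondence_analysis`, the toy matrix `expr`, and the sample labels are hypothetical and serve only as illustration; they are not data from the study.

```python
import numpy as np

def correspondence_analysis(table, n_axes=2):
    """Textbook correspondence analysis of a non-negative table.

    Rows are genes and columns are samples (an assumption of this sketch).
    Returns principal coordinates for rows and columns on the first
    `n_axes` axes, plus the inertia (squared singular value) of every axis.
    """
    table = np.asarray(table, dtype=float)
    P = table / table.sum()            # correspondence matrix (sums to 1)
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)          # expectation under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates: scale singular vectors by the singular
    # values and divide by the square root of the masses.
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords[:, :n_axes], col_coords[:, :n_axes], sv ** 2

# Hypothetical toy expression table: 5 genes x 4 samples.
samples = ["dark_1", "dark_2", "light_1", "light_2"]
expr = np.array([
    [10, 12, 90, 95],   # gene induced in the "light" samples
    [80, 85, 11,  9],   # gene repressed in the "light" samples
    [50, 52, 48, 51],   # gene with a flat profile
    [ 5,  6, 40, 42],   # gene induced in the "light" samples
    [70, 66,  8, 12],   # gene repressed in the "light" samples
])

row_coords, col_coords, inertia = correspondence_analysis(expr)
for name, xy in zip(samples, col_coords):
    print(name, np.round(xy, 3))
```

Plotting `col_coords` and `row_coords` on the same axes gives the kind of joint map of samples and genes that correspondence analysis is used for here. In this toy table the two hypothetical "dark" samples fall on one side of the first axis and the two "light" samples on the other, and the inertia values sum to the chi-square statistic of the table divided by its grand total.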

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • Arabidopsis / physiology
  • Arabidopsis Proteins / genetics
  • Arabidopsis Proteins / physiology
  • Computational Biology / methods
  • Gene Deletion
  • Genes, Plant / genetics
  • Light
  • Oligonucleotide Array Sequence Analysis*
  • Pattern Recognition, Automated
  • Photosynthetic Reaction Center Complex Proteins / genetics
  • Photosynthetic Reaction Center Complex Proteins / physiology
  • Transcription, Genetic

Substances

  • Arabidopsis Proteins
  • Photosynthetic Reaction Center Complex Proteins