Clustering algorithms and other exploratory methods for microarray data analysis

Methods Inf Med. 2005;44(3):444-8.

Abstract

Objectives: We introduce methods for the exploratory analysis of microarray data, with a particular focus on cluster algorithms, and discuss their benefits and problems.

Methods: We describe the application and suitability of unsupervised learning methods for the classification of gene expression data. Cluster algorithms are treated in more detail, including the assessment of cluster quality.
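As an illustration of what clustering gene expression profiles and assessing cluster quality can look like in practice, the following is a minimal sketch and not the procedure used in the paper. It assumes a correlation-based dissimilarity, average-linkage agglomerative clustering, and the mean silhouette width as the quality measure; the expression values and all function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def pearson_distance(x, y):
    """Return 1 - Pearson correlation (assumes neither profile is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles, k):
    """Merge the two closest clusters (average linkage) until k clusters remain."""
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        return mean(pearson_distance(profiles[i], profiles[j]) for i in a for j in b)

    while len(clusters) > k:
        # combinations yields index pairs with a < b, so deleting b keeps a valid
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def mean_silhouette(profiles, clusters):
    """Mean silhouette width; assumes at least two clusters."""
    widths = []
    for c_idx, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                widths.append(0.0)  # convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = mean(pearson_distance(profiles[i], profiles[j]) for j in cluster if j != i)
            # b: mean distance to the nearest other cluster
            b = min(
                mean(pearson_distance(profiles[i], profiles[j]) for j in other)
                for o_idx, other in enumerate(clusters)
                if o_idx != c_idx
            )
            widths.append((b - a) / max(a, b))
    return mean(widths)


# Hypothetical toy data: six genes measured under four conditions.
profiles = [
    [1.0, 2.0, 3.0, 4.0],  # rising
    [2.0, 4.0, 6.0, 8.5],  # rising
    [0.5, 1.1, 1.4, 2.1],  # rising
    [4.0, 3.0, 2.0, 1.0],  # falling
    [8.0, 6.2, 4.0, 2.0],  # falling
    [2.0, 1.6, 1.1, 0.4],  # falling
]

clusters = average_linkage(profiles, k=2)
score = mean_silhouette(profiles, clusters)
print(clusters, round(score, 3))
```

A silhouette width close to 1 indicates compact, well-separated clusters, while values near 0 suggest that the partition may not reflect real structure. On real microarray data the number of clusters is unknown, so such an index would typically be compared across several candidate values of k.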

Results: When dealing with microarray data, most cluster algorithms must be applied with caution. As long as the structure of the true generating models of such data is not fully understood, simple algorithms seem more appropriate than complex black-box algorithms. New methods explicitly targeted at the analysis of microarray data are increasingly being developed to extract more useful information from the experiments.

Conclusions: Unsupervised methods can be a helpful tool for the analysis of microarray data, but a critical choice of the algorithm and a careful interpretation of the results are required in order to avoid false conclusions.

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Databases, Protein
  • Gene Expression Profiling / classification
  • Gene Expression Profiling / methods*
  • Genetic Research
  • Mathematical Computing*
  • Models, Genetic
  • Neoplasms / genetics
  • Oligonucleotide Array Sequence Analysis / classification
  • Oligonucleotide Array Sequence Analysis / methods*
  • Quality Control