Consensus framework for exploring microarray data using multiple clustering methods

OMICS. 2007 Spring;11(1):116-28. doi: 10.1089/omi.2006.0008.

Abstract

The large variety of clustering algorithms and their variants can be daunting to researchers wishing to explore patterns within their microarray datasets. Furthermore, each clustering method has distinct biases in finding patterns within the data, and clusterings may not be reproducible across different algorithms. A consensus approach utilizing multiple algorithms can show where the various methods agree and expose robust patterns within the data. In this paper, we present Consense, a software package written for R/Bioconductor that uses such an approach to explore microarray datasets. Consense produces clustering results for each of the clustering methods and a report of metrics comparing the individual clusterings. A feature of Consense is the identification of genes that cluster consistently with an index gene across methods. Using simulated microarray data, we explore the sensitivity of the metrics to the biases of the different clustering algorithms. The framework is easily extensible, allowing the tool to be used with other functional genomic data types, as well as other high-throughput OMICS data types generated from metabolomic and proteomic experiments. It also provides a flexible environment to benchmark new clustering algorithms. Consense is currently available as an installable R/Bioconductor package (http://www.ohsucancer.com/isrdev/consense/).
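The consensus idea described in the abstract can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the Consense package or its API. The function names, the gene identifiers, the cluster labels, and the choice of the Rand index as the comparison metric are all assumptions; the abstract does not say which metrics Consense reports. The sketch shows two things: the fraction of methods in which each gene shares a cluster with a chosen index gene, and a pairwise agreement score between two clusterings.

```python
from itertools import combinations


def coclustering_frequency(clusterings, index_gene):
    """Fraction of methods in which each gene shares a cluster with index_gene.

    clusterings maps a method name to a dict of gene -> cluster label.
    (Hypothetical helper, not part of Consense.)
    """
    n_methods = len(clusterings)
    counts = {}
    for labels in clusterings.values():
        target = labels[index_gene]
        for gene, label in labels.items():
            if gene == index_gene:
                continue
            # Add 1 when this method puts the gene in the index gene's cluster.
            counts[gene] = counts.get(gene, 0) + (label == target)
    return {gene: count / n_methods for gene, count in counts.items()}


def rand_index(labels_a, labels_b):
    """Share of gene pairs that both clusterings treat the same way.

    A pair counts as agreement when both clusterings put the two genes
    together, or both put them apart. Assumed metric, chosen for simplicity.
    """
    agree = 0
    total = 0
    for g, h in combinations(sorted(labels_a), 2):
        same_a = labels_a[g] == labels_a[h]
        same_b = labels_b[g] == labels_b[h]
        agree += same_a == same_b
        total += 1
    return agree / total


# Made-up cluster labels from three hypothetical methods over four genes.
clusterings = {
    "kmeans": {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    "hierarchical": {"g1": 0, "g2": 0, "g3": 0, "g4": 1},
    "pam": {"g1": 1, "g2": 1, "g3": 0, "g4": 0},
}

freq = coclustering_frequency(clusterings, "g1")
# Genes that co-cluster with the index gene under every method.
consistent = sorted(gene for gene, f in freq.items() if f == 1.0)
print(consistent)
print(rand_index(clusterings["kmeans"], clusterings["hierarchical"]))
```

Cluster label values are arbitrary per method, so the sketch compares only whether genes fall together or apart. That is why the "kmeans" and "pam" rows above count as the same partition even though their label numbers differ.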

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Cell Cycle
  • Cluster Analysis*
  • Computational Biology / methods*
  • Computer Simulation
  • Fungal Proteins / genetics
  • Gene Expression Profiling
  • Genomics / methods*
  • Oligonucleotide Array Sequence Analysis*
  • Pattern Recognition, Automated
  • Software

Substances

  • Fungal Proteins