Open Source Clustering Software

Bioinformatics. 2004 Jun 12;20(9):1453-4. doi: 10.1093/bioinformatics/bth078. Epub 2004 Feb 10.

Abstract

Summary: We have implemented k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs. Using this library, we have created an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C.

Availability: The C Clustering Library and the corresponding Python C extension module Pycluster were released under the Python License, while the Perl module Algorithm::Cluster was released under the Artistic License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well as the corresponding command-line program, were released under the same license as the original Cluster code. The complete source code is available at http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster. Alternatively, Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available as part of the Biopython distribution.

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Cluster Analysis*
  • Gene Expression Profiling / methods*
  • Pattern Recognition, Automated / methods
  • Programming Languages*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Software*