pyGCluster, a novel hierarchical clustering approach

Bioinformatics. 2014 Mar 15;30(6):896-8. doi: 10.1093/bioinformatics/btt626. Epub 2013 Oct 31.


Summary: pyGCluster is a clustering algorithm that uses noise injection for subsequent cluster validation. It assesses the reproducibility of a large number of clusters obtained with agglomerative hierarchical clustering and evaluates a multitude of different distance-linkage combinations. Finally, highly reproducible clusters are meta-clustered into communities. The results can be illustrated graphically as node maps and expression maps.
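
The noise-injection idea in the summary can be illustrated with a minimal sketch. It is not pyGCluster's own code or API: the function name `resample_clusters`, its parameters, the fixed two-cluster tree cut, and the toy data are all assumptions made for illustration, and it targets a current Python 3 with NumPy and SciPy. Each iteration adds Gaussian noise to the data, re-clusters it with every distance-linkage combination, and counts how often each exact set of rows reappears as a cluster.

```python
from collections import Counter
from itertools import product

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def resample_clusters(data, sd, n_iter=50, n_clusters=2,
                      metrics=("euclidean", "cityblock"),
                      methods=("complete", "average"), seed=0):
    """Count how often each cluster recurs under injected noise.

    Hypothetical helper, not part of pyGCluster. A cluster is identified
    by the frozenset of its row indices, and counts are kept separately
    for every (metric, method) combination.
    """
    rng = np.random.default_rng(seed)
    counts = Counter()
    for _ in range(n_iter):
        # Noise injection: perturb every value with Gaussian noise.
        noisy = data + rng.normal(0.0, sd, size=data.shape)
        for metric, method in product(metrics, methods):
            # Agglomerative hierarchical clustering of the noisy rows.
            tree = linkage(noisy, method=method, metric=metric)
            # Simplification: cut the tree into a fixed number of clusters.
            labels = fcluster(tree, t=n_clusters, criterion="maxclust")
            for label in np.unique(labels):
                members = frozenset(np.flatnonzero(labels == label).tolist())
                counts[(metric, method, members)] += 1
    return counts


# Toy data (assumed): two well-separated groups of three rows each.
data = np.array([
    [0.0, 0.0, 0.0],
    [0.2, 0.1, 0.0],
    [0.1, 0.0, 0.2],
    [10.0, 10.0, 10.0],
    [10.1, 9.9, 10.0],
    [9.8, 10.0, 10.2],
])
counts = resample_clusters(data, sd=0.1, n_iter=10)
for (metric, method, members), n in sorted(counts.items(), key=str):
    print(metric, method, sorted(members), n / 10)
```

Dividing each count by the number of iterations gives a reproducibility frequency per cluster and distance-linkage combination. In this sketch, clusters whose frequency stays high across combinations would be the candidates for the meta-clustering step described above.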

Availability and implementation: pyGCluster requires Python 2.7. It is freely available at and published under the MIT license. Dependencies are NumPy, SciPy and, optionally, fastcluster and rpy2.


Supplementary information: Supplementary data is available at Bioinformatics online and at

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Sequence
  • Cluster Analysis*
  • Humans
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated / methods
  • Reproducibility of Results