Human gene correlation analysis (HGCA): a tool for the identification of transcriptionally co-expressed genes

BMC Res Notes. 2012 Jun 6:5:265. doi: 10.1186/1756-0500-5-265.


Background: Bioinformatics and high-throughput technologies such as microarray studies allow the expression levels of large numbers of genes to be measured simultaneously, helping us to understand the molecular mechanisms of various biological processes in a cell.

Findings: We calculate the Pearson Correlation Coefficient (r-value) between probe set signal values from Affymetrix Human Genome Microarray samples and cluster the human genes according to the r-value correlation matrix using the Neighbour Joining (NJ) clustering method. A hypergeometric distribution is applied to the text annotations of the probe sets to quantify the overrepresentation of terms. The tool aims to identify the genes most closely correlated with a given gene of interest and/or to predict that gene's biological function from the annotations of its gene cluster.
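
The two statistical steps named above can be illustrated with a minimal Python sketch. It is not the HGCA implementation, whose details the abstract does not give: the function names, the probe identifiers and the signal values are all hypothetical, the enrichment score is assumed to be the upper-tail hypergeometric probability, and the Neighbour Joining clustering step is omitted.

```python
from math import comb, sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length signal vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(signals):
    """Pairwise r-values for a dict mapping probe set id -> signal values."""
    ids = list(signals)
    return {a: {b: pearson_r(signals[a], signals[b]) for b in ids} for a in ids}


def hypergeom_enrichment_p(population, annotated, cluster_size, hits):
    """P(X >= hits) when drawing cluster_size probe sets without replacement
    from a population in which `annotated` probe sets carry the term."""
    upper = min(annotated, cluster_size)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, cluster_size)


if __name__ == "__main__":
    # Hypothetical signal values across four samples (illustration only).
    signals = {
        "probe_a": [1.0, 2.0, 3.0, 4.0],
        "probe_b": [2.1, 3.9, 6.2, 8.0],
        "probe_c": [4.0, 3.0, 2.0, 1.0],
    }
    for probe, row in correlation_matrix(signals).items():
        print(probe, {other: round(r, 3) for other, r in row.items()})

    # Hypothetical counts: 1000 probe sets, 50 carry a term, and a cluster
    # of 20 contains 5 of them.
    print(hypergeom_enrichment_p(1000, 50, 20, 5))
```

A small p-value from the second function would suggest that the term appears in the cluster more often than expected by chance. In practice such values are usually corrected for multiple testing, a step the abstract does not describe.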

Conclusion: Human Gene Correlation Analysis (HGCA) is a tool to classify human genes according to their coexpression levels and to identify overrepresented annotation terms in correlated gene groups. It is available at:

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Computational Biology*
  • DEAD-box RNA Helicases / genetics
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation*
  • Gene Regulatory Networks
  • HLA Antigens / genetics
  • High-Throughput Screening Assays / methods*
  • Humans
  • Intramolecular Oxidoreductases / genetics
  • Lipocalins / genetics
  • Metallothionein / genetics
  • Models, Genetic
  • Models, Statistical
  • Molecular Sequence Annotation
  • Oligonucleotide Array Sequence Analysis*
  • Promoter Regions, Genetic
  • Ribosomal Proteins / genetics
  • Transcription, Genetic*

Substances

  • HLA Antigens
  • Lipocalins
  • Ribosomal Proteins
  • Metallothionein
  • DDX4 protein, human
  • DEAD-box RNA Helicases
  • Intramolecular Oxidoreductases
  • prostaglandin R2 D-isomerase