Measuring gene expression divergence: the distance to keep

Biol Direct. 2010 Aug 6;5:51. doi: 10.1186/1745-6150-5-51.


Background: Gene expression divergence is a phenotypic trait reflecting evolution of gene regulation and characterizing dissimilarity between species and between cells and tissues within the same species. Several distance measures, such as Euclidean and correlation-based distances have been proposed for measuring expression divergence.

Results: We show that different distance measures identify different trends in gene expression patterns. When comparing orthologous genes in eight rat and human tissues, the Euclidean distance identified genes uniformly expressed in all tissues near the expression background as genes with the most conserved expression pattern. In contrast, correlation-based distance and generalized-average distance identified genes with concerted changes among homologous tissues as those most conserved. On the other hand, correlation-based distance, Euclidean distance and generalized-average distance highlight quite well the relatively high similarity of gene expression patterns in homologous tissues between species, compared to non-homologous tissues within species.

Conclusions: Different trends exist in the high-dimensional numeric data, and to highlight a particular trend an appropriate distance measure needs to be chosen. The choice of the distance measure for measuring expression divergence can be dictated by the expression patterns that are of interest in a particular study.

MeSH terms

  • Animals
  • Evolution, Molecular*
  • Gene Expression / genetics*
  • Humans
  • Phylogeny
  • Rats