Clustering of housekeeping genes provides a unified model of gene order in the human genome

Nat Genet. 2002 Jun;31(2):180-3. doi: 10.1038/ng887. Epub 2002 May 6.


It is often supposed that, except for tandem duplicates, genes are randomly distributed throughout the human genome. However, recent analyses suggest that when all the genes expressed in a given tissue (notably placenta and skeletal muscle) are examined, these genes do not map to random locations but instead resolve to clusters. We have asked three questions: (i) is this clustering true for most tissues, or are these tissues the exceptions; (ii) is any clustering simply the result of the expression of tandem duplicates; and (iii) how, if at all, does this relate to the observed clustering of genes with high expression rates? We provide a unified model of gene clustering that explains the previous observations. We examined Serial Analysis of Gene Expression (SAGE) data for 14 tissues and found significant clustering, in each tissue, that persists even after the removal of tandem duplicates. We confirmed clustering by analysis of independent expressed-sequence tag (EST) data. We then tested the possibility that the human genome is organized into subregions, each specializing in genes needed in a given tissue. By comparing genes expressed in different tissues, we show that this is not the case: those genes that seem to be tissue-specific in their expression do not, as a rule, cluster. We report that genes that are expressed in most tissues (housekeeping genes) show strong clustering. In addition, we show that the apparent clustering of genes with high expression rates is a consequence of the clustering of housekeeping genes.
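To make the idea of "significant clustering" concrete, the following is a minimal sketch of one way such a test could be set up. It is not the authors' method: the abstract does not describe their statistic, so the adjacent-pair count, the shuffle-based null model, the function names (`adjacent_pairs`, `clustering_p_value`) and the toy data are all assumptions chosen for illustration. Genes are represented as a list of booleans in chromosomal order, where `True` means "expressed in the tissue of interest".

```python
import random


def adjacent_pairs(flags):
    """Count neighbouring gene pairs in which both genes are flagged."""
    return sum(1 for a, b in zip(flags, flags[1:]) if a and b)


def clustering_p_value(flags, n_shuffles=2000, seed=0):
    """Permutation test (illustrative assumption, not the paper's statistic).

    Compares the observed number of flagged neighbouring pairs with the
    counts obtained after randomly shuffling the flags along the chromosome.
    Returns (observed_count, one_sided_p_value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = adjacent_pairs(flags)
    shuffled = list(flags)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if adjacent_pairs(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (at_least_as_extreme + 1) / (n_shuffles + 1)


# Hypothetical toy chromosomes with 40 genes each.
clustered = [True] * 10 + [False] * 30   # expressed genes sit side by side
dispersed = [True, False] * 20           # expressed genes never adjacent

clustered_count, clustered_p = clustering_p_value(clustered)
dispersed_count, dispersed_p = clustering_p_value(dispersed)
```

In this toy setting, the clustered arrangement yields a small p-value and the dispersed one does not. A real analysis would also need to handle tandem duplicates, multiple chromosomes, and varying gene density, none of which this sketch attempts.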

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Order*
  • Genome, Human*
  • Humans
  • Organ Specificity / genetics