The complete human olfactory subgenome

Genome Res. 2001 May;11(5):685-702. doi: 10.1101/gr.171001.

Abstract

Olfactory receptors likely constitute the largest gene superfamily in the vertebrate genome. Here we present the nearly complete human olfactory subgenome, elucidated by mining the draft genome sequence with gene discovery algorithms. Over 900 olfactory receptor genes and pseudogenes (ORs) were identified, two-thirds of which were not annotated previously. The number of extrapolated ORs is in good agreement with previous theoretical predictions. The sequence of at least 63% of the ORs is disrupted by what appears to be a random process of pseudogene formation. ORs constitute 17 gene families, 4 of which contain more than 100 members each. "Fish-like" Class I ORs, previously considered a relic in higher tetrapods, constitute as much as 10% of the human repertoire, all in one large cluster on chromosome 11. Their lower pseudogene fraction suggests a functional significance. ORs are found on all human chromosomes except 20 and Y, and nearly 80% of them lie in clusters of 6-138 genes. A novel comparative cluster analysis was used to trace the evolutionary path that may have led to OR proliferation and diversification throughout the genome. The results of this analysis suggest the following genome expansion history: first, the generation of a "tetrapod-specific" Class II OR cluster on chromosome 11 by local duplication, then a single-step duplication of this cluster to chromosome 1, and finally an avalanche of duplication events out of chromosome 1 to most other chromosomes. The results of the data mining and characterization of ORs can be accessed at the Human Olfactory Receptor Data Exploratorium Web site (http://bioinfo.weizmann.ac.il/HORDE).

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Chromosome Mapping / methods
  • Chromosome Mapping / statistics & numerical data
  • Databases, Factual
  • Evolution, Molecular
  • GC Rich Sequence / genetics
  • Gene Frequency / genetics
  • Genome, Human*
  • Humans
  • Multigene Family / genetics
  • Phylogeny
  • Pseudogenes / genetics
  • Receptors, Odorant / classification
  • Receptors, Odorant / genetics*
  • Receptors, Odorant / physiology
  • Sequence Homology, Nucleic Acid

Substances

  • Receptors, Odorant