Comparative genomics of gene-family size in closely related bacteria

Genome Biol. 2004;5(4):R27. doi: 10.1186/gb-2004-5-4-r27. Epub 2004 Mar 18.


Background: The wealth of genomic data in bacteria is helping microbiologists understand the factors involved in gene innovation. Among these, the expansion and reduction of gene families appear to have a fundamental role, but the factors influencing gene family size are unclear.

Results: The relative content of paralogous genes in bacterial genomes increases with genome size, largely due to the expansion of gene family size in large genomes. Bacteria undergoing genome reduction display a parallel process of redundancy elimination, by which gene families are reduced to one or a few members. Gene family size is also influenced by sequence divergence and physiological function. Large gene families show wider sequence divergence, suggesting they are probably older, and certain functions (such as metabolite transport mechanisms) are overrepresented in large families. The size of a given gene family is remarkably similar in strains of the same species and in closely related species, suggesting that homologous gene families are vertically transmitted and depend little on horizontal gene transfer (HGT).
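As an illustration of what "relative content of paralogous genes" could mean in practice, the sketch below computes the fraction of genes in a genome that fall into a family with two or more members. It is a minimal sketch and not the authors' method: the abstract does not say how families were delineated, so the gene-to-family mapping, the `paralog_fraction` helper, and the toy genomes are all hypothetical.

```python
from collections import Counter

def paralog_fraction(gene_to_family):
    """Fraction of genes belonging to a family with two or more members.

    gene_to_family: dict mapping a gene identifier to a family identifier
    (assumed to come from some prior clustering step, not shown here).
    """
    if not gene_to_family:
        return 0.0
    # Count how many genes each family contains.
    family_sizes = Counter(gene_to_family.values())
    # A gene counts as paralogous if its family has at least one other member.
    paralogous = sum(
        1 for family in gene_to_family.values() if family_sizes[family] >= 2
    )
    return paralogous / len(gene_to_family)

# Hypothetical toy genomes, for illustration only.
small_genome = {"g1": "famA", "g2": "famB", "g3": "famC", "g4": "famC"}
large_genome = {
    "g1": "famA", "g2": "famA", "g3": "famA", "g4": "famB",
    "g5": "famC", "g6": "famC", "g7": "famD", "g8": "famD",
}

print(paralog_fraction(small_genome))  # 0.5
print(paralog_fraction(large_genome))  # 0.875
```

In this toy case the larger genome has both a higher paralog fraction and a larger maximum family size, which is the kind of pattern the Results describe; real data would need an explicit homology criterion to define families.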

Conclusions: The remarkable preservation of copy numbers in widely different ecotypes indicates a functional role for the different copies rather than simply a back-up role. When different genera are compared, the increase in phylogenetic distance and/or ecological specialization disrupts this preservation, albeit in a gradual manner and maintaining an overall similarity, which also supports this view. HGT can have an important role, however, in nonhomologous gene families, as exemplified by a comparison between saprophytic and enterohemorrhagic strains of Escherichia coli.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chlamydophila pneumoniae / genetics
  • Chlamydophila pneumoniae / pathogenicity
  • Computational Biology
  • Escherichia coli / genetics*
  • Escherichia coli / pathogenicity
  • Escherichia coli O157 / genetics*
  • Escherichia coli O157 / pathogenicity
  • Evolution, Molecular
  • Gene Transfer, Horizontal / genetics
  • Genes, Bacterial / genetics
  • Genes, Bacterial / physiology
  • Genome, Bacterial*
  • Genomics / methods*
  • Multigene Family / genetics
  • Multigene Family / physiology
  • Staphylococcus aureus / genetics
  • Staphylococcus aureus / pathogenicity
  • Streptococcus pyogenes / genetics
  • Streptococcus pyogenes / pathogenicity