Estimating the size of the bacterial pan-genome

Trends Genet. 2009 Mar;25(3):107-10. doi: 10.1016/j.tig.2008.12.004. Epub 2009 Jan 23.


The 'pan-genome' denotes the set of all genes present in the genomes of a group of organisms. Here, we extend the pan-genome concept to higher taxonomic units. Using 573 sequenced genomes, we estimate the size of the bacterial pan-genome based on the frequency of occurrences of genes among sampled genomes. Using gene- and genome-centered approaches, we characterize three distinct pools of gene families that comprise the bacterial pan-genome, each evolving under different evolutionary constraints. Our findings indicate that the pan-genome of the bacterial domain is of infinite size (the Bacteria as a whole have an open pan-genome) and that approximately 250 genes per genome belong to the extended bacterial core genome.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Evolution, Molecular*
  • Gene Frequency
  • Genes, Bacterial
  • Genome, Bacterial*
  • Molecular Sequence Data
  • Sequence Homology, Amino Acid