, 12, 66

Genomes in Turmoil: Quantification of Genome Dynamics in Prokaryote Supergenomes

Pere Puigbò et al. BMC Biol.

Abstract

Background: Genomes of bacteria and archaea (collectively, prokaryotes) appear to exist in incessant flux, expanding via horizontal gene transfer and gene duplication, and contracting via gene loss. However, the actual rates of genome dynamics and relative contributions of different types of event across the diversity of prokaryotes are largely unknown, as are the sizes of microbial supergenomes, i.e. pools of genes that are accessible to the given microbial species.

Results: We performed a comprehensive analysis of the genome dynamics in 35 groups (34 bacterial and one archaeal) of closely related microbial genomes using a phylogenetic birth-and-death maximum likelihood model to quantify the rates of gene family gain and loss, as well as expansion and reduction. The results show that loss of gene families dominates the evolution of prokaryotes, occurring at approximately three times the rate of gain. The rates of gene family expansion and reduction are typically seven and twenty times lower than the gain and loss rates, respectively. Thus, the prevailing mode of evolution in bacteria and archaea is genome contraction, which is partially compensated by the gain of new gene families via horizontal gene transfer. However, the rates of gene family gain, loss, expansion and reduction vary within wide ranges, with the most stable genomes showing rates about 25 times lower than the most dynamic genomes. For many groups, the supergenome estimated from the fraction of repetitive gene family gains includes about tenfold more gene families than the typical genome in the group, although some groups appear to have vast, 'open' supergenomes.

Conclusions: Reconstruction of evolution for groups of closely related bacteria and archaea reveals an extremely rapid and highly variable flux of genes in evolving microbial genomes, demonstrates that extensive gene loss and horizontal gene transfer leading to innovation are the two dominant evolutionary processes, and yields robust estimates of the supergenome size.
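
The relative rates quoted in the Results can be combined with a few lines of arithmetic. The sketch below is illustrative only: it normalises the gain rate to 1, takes the approximate factors from the abstract (loss about three times gain, expansion about seven times lower than gain, reduction about twenty times lower than loss), and assumes that family-level and copy-level events can simply be summed as counts of events. That summation is an assumption of this sketch, not a quantity reported by the authors.

```python
from fractions import Fraction

# Approximate factors quoted in the abstract; gain rate normalised to 1.
gain = Fraction(1)
loss = 3 * gain          # loss occurs at ~3x the rate of gain
expansion = gain / 7     # expansion is ~7x lower than gain
reduction = loss / 20    # reduction is ~20x lower than loss

# Assumption of this sketch: events of each kind are simply added up.
shrinking_events = loss + reduction
growing_events = gain + expansion

net_ratio = shrinking_events / growing_events
reduction_to_expansion = reduction / expansion

print(f"shrinking/growing event ratio: {float(net_ratio):.2f}")
print(f"reduction/expansion ratio:     {float(reduction_to_expansion):.2f}")
```

Under these rounded factors, reduction and expansion come out nearly balanced, while family-level loss still outpaces gain, which is consistent with the abstract's statement that genome contraction is the prevailing mode.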

Figures

Figure 1
The clock of genome dynamics. The figure shows the correlation of branch lengths and number of (a) gains, (b) losses, (c) expansions and (d) reductions. It excludes singletons, i.e., gains in the terminal branches of the tree. Both x and y axes have a logarithmic scale. All P < 0.0001. BL, branch length or number of nucleotide substitutions per site.
Figure 2
Distributions of the genome dynamics rates across the ATGCs. (a) Rates of gain, loss, expansion and reduction per nucleotide substitution per site. (b) Loss/gain and reduction/expansion ratios. (c) Gain/expansion and loss/reduction ratios. G/E, gain/expansion; L/G, loss/gain; L/R, loss/reduction; R/E, reduction/expansion.
Figure 3
Distribution of the gain, loss, expansion and reduction rates over the evolutionary tree of prokaryotes. The tree is from MicrobesOnline [62]. The areas of the circles are proportional to the rates of the respective events on a logarithmic scale. The numbers in parentheses indicate the number of species in the ATGC. The ATGCs with episodes of rapid gene gain are denoted with * (<10% of branches) or ** (>10% of branches). ATGC, alignable tight genome cluster.
Figure 4
Dependence of the rates of gains, losses, expansions and reductions on phylogenetic depth. (a) Gains, (b) losses, (c) expansions and (d) reductions per unit of branch length vs the phylogenetic depth. The figure excludes singletons, i.e., gains in the terminal branches of the tree. Both x and y axes have a logarithmic scale. The phylogenetic depth is measured in the number of nucleotide substitutions per site.
Figure 5
Dependence of the rates of gain, loss, expansion and reduction on bacterial taxonomy and lifestyle. (a) Rates of the four types of event for Actinobacteria, Firmicutes and Proteobacteria. (b) Rates of the four types of event for bacteria and archaea with three different lifestyles. FHA, facultative host-associated; FL, free-living; P, obligate intracellular parasite.
Figure 6
Correlations between the rates of gain, loss, expansion and reduction.
Figure 7
Principal component analysis of the rates of gains, losses, expansions and reductions. (a) XY-plot of the first two principal components. (b) Principal component analysis loadings. Comp., component.
Figure 8
Correlation between gene flux and genome size. The horizontal axis shows the median number of genes in a genome in an ATGC. ATGC, alignable tight genome cluster; GDE, total gene flux (number of genome dynamics events per nucleotide substitution per site).
Figure 9
Genome flux by COG functional categories. (a) Flux. (b) Gain. (c) Loss. (d) Expansion. (e) Reduction. Designations of the functional categories (modified from [67]): C, energy production and conversion; D, cell division; E, amino acid metabolism and transport; F, nucleotide metabolism and transport; G, carbohydrate metabolism and transport; H, coenzyme metabolism; I, lipid metabolism; J, translation; K, transcription; L, replication and repair; M, membrane and cell wall structure and biogenesis; N, secretion and motility; O, post-translational modification, protein turnover and chaperone functions; P, inorganic ion transport and metabolism; Q, biosynthesis, transport and catabolism of secondary metabolites; R, general functional prediction only (typically, prediction of biochemical activity); S, function unknown; T, signal transduction; U, intracellular trafficking and secretion; V, defense systems; X, mobilome. COG, cluster of orthologous genes.
Figure 10
Comparison of genome, pangenome and estimated supergenome sizes. (a) Median genome vs supergenome size. (b) Density distribution of median genome, pangenome and supergenome size.
Figure 11
Distribution of the median genome, pangenome and estimated supergenome sizes over the evolutionary tree of prokaryotes. The tree is from MicrobesOnline [73]. Areas of the circles are proportional to the number of genes in the respective genomes (median), pangenome, and supergenome. FHA, facultative host-associated; FL, free-living; O, open supergenome; P, obligate intracellular parasite.
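
The captions above express rates as events per nucleotide substitution per site, and define GDE as the total number of genome dynamics events in those units. A minimal sketch of that normalisation follows, assuming event counts and a branch length are already available for a branch or tree. The function names and the example counts are hypothetical and are not taken from the paper, whose rates come from a maximum likelihood birth-and-death model rather than from this direct division.

```python
def rate_per_substitution(event_count: int, branch_length: float) -> float:
    """Events per nucleotide substitution per site (hypothetical helper)."""
    if branch_length <= 0:
        raise ValueError("branch_length must be positive")
    return event_count / branch_length


def total_gene_flux(gains: int, losses: int, expansions: int,
                    reductions: int, branch_length: float) -> float:
    """Sum of all four event types, normalised by branch length (GDE-like)."""
    total_events = gains + losses + expansions + reductions
    return rate_per_substitution(total_events, branch_length)


# Hypothetical counts, chosen only to illustrate the units.
example_flux = total_gene_flux(gains=10, losses=30, expansions=2,
                               reductions=3, branch_length=0.25)
print(example_flux)
```

Dividing by branch length is what makes rates comparable across groups that have diverged by different amounts.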
