Background: The Bacillus cereus sensu lato group contains ubiquitous, facultatively anaerobic, soil-borne, Gram-positive, spore-forming bacilli. Molecular phylogeny and comparative genome sequencing have suggested that these organisms should be classified as a single species. Although these organisms are clonal in nature, there do not appear to be species-specific clonal lineages, with the exception of B. anthracis, despite the wide array of phenotypes they display.
Results: We compared the protein-coding content of 201 B. cereus sensu lato genomes to characterize their differences and to understand the consequences of these differences for biological function. From this larger group we selected a subset of 25 whole genomes for deeper analysis. Cluster analysis of orthologous proteins grouped these genomes into five distinct clades. Each clade could be characterized by unique genes shared among its members, with consequences for the phenotype of that clade. Surprisingly, this population structure recapitulates our recent observations on the divergence of the generalized stress response (SigB) regulons in these organisms. Divergence of the SigB regulon among these organisms is primarily due to the placement of SigB-dependent promoters, which brings genes from a common gene pool into or out of the SigB regulon.
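The clustering step described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the Jaccard distance, the average-linkage merging rule, and all genome names and ortholog-group identifiers (`genome_A`, `og1`, and so on) are assumptions chosen only to show how gene-content clustering and clade-specific gene sets relate to each other.

```python
from itertools import combinations

# Hypothetical toy data: each genome maps to the set of ortholog groups it encodes.
content = {
    "genome_A": {"og1", "og2", "og3"},
    "genome_B": {"og1", "og2", "og3", "og4"},
    "genome_C": {"og1", "og5", "og6"},
    "genome_D": {"og1", "og5", "og6", "og7"},
}

def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of ortholog-group IDs."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def cluster_genomes(content, n_clades):
    """Average-linkage agglomerative clustering of genomes by gene content.

    Assumption: the number of clades is supplied by the caller.
    """
    clusters = [frozenset([name]) for name in sorted(content)]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        dists = [jaccard_distance(content[x], content[y]) for x in c1 for y in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clades:
        # Merge the closest pair of clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters

def clade_specific(content, clade):
    """Ortholog groups found in every member of the clade and in no other genome."""
    core = set.intersection(*(content[g] for g in clade))
    outside = set().union(*(content[g] for g in content if g not in clade))
    return core - outside

clades = cluster_genomes(content, n_clades=2)
for clade in clades:
    print(sorted(clade), sorted(clade_specific(content, clade)))
```

In this toy example the shared group `og1` belongs to neither clade-specific set, since it is common to all genomes; only groups confined to one clade and present in all of its members are reported.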
Conclusions: Collectively, our observations suggest the hypothesis that the evolution of these closely related bacteria is a consequence of two distinct processes. Horizontal gene transfer, gene duplication and divergence, and gene deletion dictate the underlying coding capacity of these genomes. Regulatory divergence overlays this protein-coding reservoir and shapes the expression of both the unique and the shared coding capacity of these organisms, resulting in phenotypic divergence. Data from other organisms suggest that this is likely a common pattern in prokaryotic evolution.