A taxonomy of bacterial microcompartment loci constructed by a novel scoring method

PLoS Comput Biol. 2014 Oct 23;10(10):e1003898. doi: 10.1371/journal.pcbi.1003898. eCollection 2014 Oct.


Bacterial microcompartments (BMCs) are proteinaceous organelles involved in both autotrophic and heterotrophic metabolism. All BMCs share homologous shell proteins but differ in their complement of enzymes; these are typically encoded adjacent to shell protein genes in genetic loci, or operons. To enable the identification and prediction of functional (sub)types of BMCs, we developed LoClass, an algorithm that finds putative BMC loci and inventories, weights, and compares their constituent Pfam domains to construct a locus similarity network and predict locus (sub)types. In addition to using LoClass to analyze sequences in the Non-redundant Protein Database, we compared predicted BMC loci found in seven candidate bacterial phyla (six from single-cell genomic studies) to the LoClass taxonomy. Together, these analyses resulted in the identification of 23 different types of BMCs encoded in 30 distinct locus (sub)types found in 23 bacterial phyla. These include the two carboxysome types and a divergent set of metabolosomes, BMCs that share a common catalytic core and process distinct substrates via specific signature enzymes. Furthermore, many Candidate BMCs were found that lack one or more core metabolosome components, including one that is predicted to represent an entirely new paradigm for BMC-associated metabolism, joining the carboxysome and metabolosome. By placing these results in a phylogenetic context, we provide a framework for understanding the horizontal transfer of these loci, a starting point for studies aimed at understanding the evolution of BMCs. This comprehensive taxonomy of BMC loci, based on their constituent protein domains, foregrounds the functional diversity of BMCs and provides a reference for interpreting the role of BMC gene clusters encoded in isolate, single-cell, and metagenomic data. Many loci encode ancillary functions such as transporters or genes for cofactor assembly; this expanded vocabulary of BMC-related functions should be useful for the design of genetic modules for introducing BMCs in bioengineering applications.
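
The general idea of inventorying, weighting, and comparing the domain content of loci to build a similarity network can be illustrated with a minimal sketch. This is not the LoClass scoring method, which the abstract does not specify; it assumes an inverse-frequency weight per domain and a weighted Jaccard similarity, and the locus names, domain labels, function names, and the 0.5 threshold are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def domain_weights(loci):
    """Assumed weighting: domains found in fewer loci get a higher weight."""
    n = len(loci)
    counts = Counter(d for domains in loci.values() for d in set(domains))
    return {d: math.log(1 + n / c) for d, c in counts.items()}


def weighted_jaccard(a, b, weights):
    """Weight of shared domains divided by weight of all domains in either locus."""
    a, b = set(a), set(b)
    union = sum(weights[d] for d in a | b)
    if union == 0:
        return 0.0
    return sum(weights[d] for d in a & b) / union


def similarity_network(loci, threshold=0.5):
    """Return {(locus_x, locus_y): score} for pairs scoring at or above threshold."""
    weights = domain_weights(loci)
    edges = {}
    for x, y in combinations(sorted(loci), 2):
        score = weighted_jaccard(loci[x], loci[y], weights)
        if score >= threshold:
            edges[(x, y)] = score
    return edges


# Hypothetical domain inventories for three made-up loci.
loci = {
    "locus_A": ["BMC", "EutN_CcmL", "RuBisCO_large"],
    "locus_B": ["BMC", "EutN_CcmL", "RuBisCO_large", "RuBisCO_small"],
    "locus_C": ["BMC", "EutN_CcmL", "Aldedh"],
}
edges = similarity_network(loci)
for pair, score in edges.items():
    print(pair, round(score, 3))
```

In this toy input, the shell-protein domains shared by every locus carry little weight, so loci are linked mainly by the rarer enzyme domains they have in common. Connected groups in the resulting network would then be candidates for a shared locus (sub)type.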

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Aldehyde Dehydrogenase
  • Algorithms
  • Bacteria* / genetics
  • Bacteria* / metabolism
  • Bacterial Proteins* / genetics
  • Bacterial Proteins* / metabolism
  • Bioengineering
  • Computational Biology / methods*
  • Gene Regulatory Networks* / genetics
  • Gene Regulatory Networks* / physiology
  • Organelles
  • Phylogeny


Substances

  • Bacterial Proteins
  • Aldehyde Dehydrogenase

Grant support

This work was supported by the National Science Foundation (EF1105892 and MCB1160614 to CAK) and the US Department of Energy contract no. DE-AC02-05CH11231 (to CAK). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.