Using Core Genome Alignments To Assign Bacterial Species

mSystems. 2018 Dec 4;3(6):e00236-18. doi: 10.1128/mSystems.00236-18. eCollection 2018 Nov-Dec.

Abstract

With the exponential increase in the number of bacterial taxa with genome sequence data, a new standardized method for assigning species designations is needed, one that is consistent with classically obtained taxonomic analyses. The need is particularly acute for unculturable, obligate intracellular bacteria, such as those in the Rickettsiales, for which classical methods like DNA-DNA hybridization cannot be used. In this study, we generated nucleotide-based core genome alignments for a wide range of genera with classically defined species, as well as those within the Rickettsiales. We created a workflow that uses the length, sequence identity, and phylogenetic relationships inferred from core genome alignments to assign genus and species designations that recapitulate classically obtained results. Using this method, most classically defined bacterial genera have a core genome alignment that is ≥10% of the average input genome length. Both Anaplasma and Neorickettsia fail to meet this criterion, indicating that the taxonomy of these genera should be reexamined. Genomes from organisms with the same species epithet consistently have ≥96.8% identity across their core genome alignments. Additionally, these core genome alignments can be used to generate phylogenomic trees that identify the monophyletic clades defining species, and neighbor-network trees that assess recombination across different taxa. By these criteria, Wolbachia organisms are delineated into species that differ from the currently used supergroup designations, while Rickettsia organisms are delineated into 9 distinct species, compared to the current 27 species. By using core genome alignments to assign taxonomic designations, we aim to provide a high-resolution, robust method to guide bacterial nomenclature that is aligned with classically obtained results.
IMPORTANCE With the increasing availability of genome sequences, we sought to develop and apply a robust, portable, and high-resolution method for the assignment of genera and species designations that can recapitulate classically defined taxonomic designations. Using cutoffs derived from the lengths and sequence identities of core genome alignments along with phylogenetic analyses, we sought to evaluate or reevaluate genus- and species-level designations for diverse taxa, with an emphasis on the order Rickettsiales, where species designations have been applied inconsistently. Our results indicate that the Rickettsia genus has an overabundance of species designations, that the current Anaplasma and Neorickettsia genus designations are both too broad and need to be divided, and that there are clear demarcations of Wolbachia species that do not align precisely with the existing supergroup designations.
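
The two numeric cutoffs stated in the abstract (a core genome alignment spanning ≥10% of the average input genome length for a genus, and ≥96.8% core genome alignment identity within a species) can be illustrated with a minimal Python sketch. This is not the authors' workflow: the function names are hypothetical, skipping gap positions when computing identity is an assumption, and grouping genomes by single linkage is a simplification that leaves out the phylogenetic (monophyly) and recombination analyses the study also relies on.

```python
from itertools import combinations

# Cutoffs taken from the abstract; everything else below is illustrative.
GENUS_MIN_CORE_FRACTION = 0.10   # core alignment length / mean genome length
SPECIES_MIN_IDENTITY = 0.968     # pairwise identity over the core alignment


def core_fraction(core_alignment_length, genome_lengths):
    """Core alignment length as a fraction of the mean input genome length."""
    mean_length = sum(genome_lengths) / len(genome_lengths)
    return core_alignment_length / mean_length


def identity(seq_a, seq_b):
    """Fraction of identical positions between two aligned sequences.

    Assumption: positions with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def group_species(aligned):
    """Single-linkage grouping of genomes at the species identity cutoff.

    `aligned` maps a genome name to its row of the core genome alignment.
    Simplification: no tree-based monophyly check is performed here.
    """
    names = list(aligned)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if identity(aligned[a], aligned[b]) >= SPECIES_MIN_IDENTITY:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())
```

As a usage example with made-up data, `core_fraction(150_000, [1_000_000, 1_200_000, 800_000])` gives 0.15, which would pass the genus cutoff, and three toy alignment rows at 98% and 90% identity to a reference would be split into two candidate species groups.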

Keywords: Anaplasma; Rickettsia; Rickettsiales; Wolbachia; bacterial taxonomy; core genome alignment; genomics; species concept.