The genus Mycobacterium contains 188 species, including several major human pathogens as well as numerous environmental species. We report here comprehensive phylogenomic and comparative genomic analyses of 150 genomes of Mycobacterium species to understand their interrelationships. Phylogenetic trees were constructed for the 150 species based on 1941 core proteins for the genus Mycobacterium, 136 core proteins for the phylum Actinobacteria and 8 other conserved proteins. Additionally, the overall genome similarity amongst the Mycobacterium species was determined based on average amino acid identity of the conserved protein families. The results from these analyses consistently support the existence of five distinct monophyletic groups within the genus Mycobacterium at the highest level, which are designated as the "Tuberculosis-Simiae," "Terrae," "Triviale," "Fortuitum-Vaccae," and "Abscessus-Chelonae" clades. Some of these clades have also been observed in earlier phylogenetic studies. Of these clades, the "Abscessus-Chelonae" clade forms the deepest branching lineage and does not form a monophyletic grouping with the "Fortuitum-Vaccae" clade of fast-growing species. In parallel, our comparative analyses of proteins from mycobacterial genomes have identified 172 molecular signatures in the form of conserved signature indels and conserved signature proteins, which are uniquely shared by either all Mycobacterium species or by members of the five identified clades. The identified molecular signatures (or synapomorphies) provide strong independent evidence for the monophyly of the genus Mycobacterium and the five described clades, and they provide reliable means for the demarcation of these clades and for their diagnostics.
Based on the results of our comprehensive phylogenomic analyses and the numerous identified molecular signatures, which consistently and strongly support the division of known mycobacterial species into the five described clades, we propose here the division of the genus Mycobacterium into an emended genus Mycobacterium encompassing the "Tuberculosis-Simiae" clade, which includes all of the major human pathogens, and four novel genera, viz. Mycolicibacterium gen. nov., Mycolicibacter gen. nov., Mycolicibacillus gen. nov. and Mycobacteroides gen. nov., corresponding to the "Fortuitum-Vaccae," "Terrae," "Triviale," and "Abscessus-Chelonae" clades, respectively. With the division of mycobacterial species into these five distinct groups, attention can now be focused on the unique genetic and molecular characteristics that differentiate members of these groups.
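As an illustration of the genome similarity measure mentioned above, the following is a minimal sketch of how an average amino acid identity between two genomes might be computed from pre-aligned conserved protein families. The abstract does not give the exact procedure used, so everything here is an assumption made for illustration: the function names, the choice to skip gapped alignment columns, the unweighted mean over families, and the toy sequences, which are invented and are not data from this study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned protein sequences.

    Columns where either sequence has a gap ('-') are skipped
    (an assumption; other conventions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def average_amino_acid_identity(families: dict[str, tuple[str, str]]) -> float:
    """Unweighted mean of per-family percent identities.

    `families` maps a (hypothetical) protein family name to the pair of
    aligned sequences, one from each of the two genomes being compared.
    """
    if not families:
        raise ValueError("at least one protein family is required")
    identities = [percent_identity(a, b) for a, b in families.values()]
    return sum(identities) / len(identities)


# Toy alignments, invented for illustration only.
toy_families = {
    "family_1": ("MKT-LV", "MKTALV"),  # 5 ungapped columns, 5 matches
    "family_2": ("MSTQ", "MSAQ"),      # 4 ungapped columns, 3 matches
}
aai = average_amino_acid_identity(toy_families)
print(f"AAI = {aai:.1f}%")
```

In practice, the aligned sequence pairs would come from an alignment tool applied to each conserved protein family, and a length-weighted mean could be substituted for the unweighted one used in this sketch.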
Keywords: Mycobacterium classification; abscessus-chelonae clade; conserved signature indels and signature proteins; fortuitum-vaccae clade; phylogenomic analysis; slow-growing and fast-growing mycobacteria; terrae clade; triviale clade.