Summary: Taxonomic analysis of microbial communities is well supported at the level of species and strains. However, species can contain significant phenotypic diversity and strains are rarely widely shared across global populations. Stratifying the diversity between species and strains can identify 'subspecies', which are a useful intermediary. High-throughput identification and profiling of subspecies is not yet supported in the microbiome field. Here, we use an operational definition of subspecies based on single nucleotide variant (SNV) patterns within species to identify and profile subspecies in metagenomes, along with their distinctive SNVs and genes. We incorporate this method into metaSNV v2, which extends existing SNV-calling software to support further SNV interpretation for population genetics. These new features support microbiome analyses to link SNV profiles with host phenotype or environment and niche-specificity. We demonstrate subspecies identification in marine and fecal metagenomes. In the latter, we analyze 70 species in 7524 adult and infant subjects, supporting a common subspecies population structure in the human gut microbiome and illustrating some limits in subspecies calling.
Availability and implementation: Source code, documentation, tutorials and test data are available at https://github.com/metasnv-tool/metaSNV and https://metasnv.embl.de.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2021. Published by Oxford University Press.