Nat Commun. 2021 Oct 20;12(1):6099.
doi: 10.1038/s41467-021-26248-1.

Population structure, biogeography and transmissibility of Mycobacterium tuberculosis

Luca Freschi et al. Nat Commun. 2021.

Abstract

Mycobacterium tuberculosis is a clonal pathogen proposed to have co-evolved with its human host for millennia, yet our understanding of its genomic diversity and biogeography remains incomplete. Here we use a combination of phylogenetics and dimensionality reduction to reevaluate the population structure of M. tuberculosis, providing an in-depth analysis of the ancient Indo-Oceanic Lineage 1 and the modern Central Asian Lineage 3, and expanding our understanding of Lineages 2 and 4. We assess sub-lineages using genomic sequences from 4939 pan-susceptible strains, and find 30 new genetically distinct clades that we validate in a dataset of 4645 independent isolates. We find a consistent geographically restricted or unrestricted pattern for 20 groups, including three groups of Lineage 1. The distribution of terminal branch lengths across the M. tuberculosis phylogeny supports the hypothesis of a higher transmissibility of Lineages 2 and 4, in comparison with Lineages 3 and 1, on a global scale. We define an expanded barcode of 95 single nucleotide substitutions that allows rapid identification of 69 M. tuberculosis sub-lineages and 26 additional internal groups. Our results paint a higher resolution picture of the M. tuberculosis phylogeny and biogeography.
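The barcode idea in the abstract can be illustrated with a minimal sketch: each diagnostic substitution marks one group, and an isolate is assigned to the most specific group whose marker it carries. The positions, alleles, and the helper name `assign_sublineage` below are hypothetical placeholders chosen for illustration; they are not the published 95-SNP barcode.

```python
from typing import Dict, Optional, Tuple

# HYPOTHETICAL barcode: genome position -> (diagnostic allele, group name).
# Positions and alleles are placeholders, not the published barcode.
TOY_BARCODE: Dict[int, Tuple[str, str]] = {
    1000: ("A", "1"),
    2000: ("T", "1.1"),
    3000: ("G", "1.1.3"),
    4000: ("C", "2"),
}


def assign_sublineage(
    calls: Dict[int, str],
    barcode: Dict[int, Tuple[str, str]] = TOY_BARCODE,
) -> Optional[str]:
    """Return the most specific group whose diagnostic allele is present.

    `calls` maps genome positions to the base called in one isolate.
    Specificity is approximated by the number of dots in the group name.
    """
    hits = [
        name
        for pos, (allele, name) in barcode.items()
        if calls.get(pos) == allele
    ]
    if not hits:
        return None
    return max(hits, key=lambda name: name.count("."))
```

This sketch does not check that the matched markers form a consistent nested path (for example, that a "1.1.3" hit is accompanied by "1" and "1.1" hits); a real typing tool would likely want to flag such conflicts as possible mixed infections or miscalls.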

PubMed Disclaimer

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Phylogenetic tree reconstruction of lineage 1 (binary tree).
Gray circles mark splits where the fixation index (FST), calculated using the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (blue: sub-lineages already described in the literature; green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.
Fig. 2. Phylogenetic tree reconstruction of lineage 3 (binary tree).
Gray circles mark splits where the fixation index (FST), calculated using the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.
Fig. 3. Phylogenetic tree reconstruction of lineage 2 (binary tree).
Gray circles mark splits where the fixation index (FST), calculated using the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (blue: sub-lineages already described in the literature; green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.
Fig. 4. Phylogenetic tree reconstruction of lineage 4 (binary tree).
Gray circles mark splits where the fixation index (FST), calculated using the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (blue: sub-lineages already described in the literature; green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.
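The FST threshold used in the four tree figures can be made concrete with a small sketch. The captions do not say which FST estimator was used, so the code below assumes a simple heterozygosity-based form, FST = (H_T - H_S) / H_T, summed over biallelic sites for haploid genomes. The function name `fst_two_groups` and the 0/1 allele encoding are illustrative assumptions.

```python
from typing import Sequence


def fst_two_groups(
    group_a: Sequence[Sequence[int]],
    group_b: Sequence[Sequence[int]],
) -> float:
    """Heterozygosity-based FST between the isolates under two child nodes.

    Each isolate is a sequence of 0/1 alleles over the same biallelic sites.
    ASSUMPTION: FST = (H_T - H_S) / H_T with sums taken across sites; the
    estimator used in the paper may differ.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups need at least one isolate")
    n_sites = len(group_a[0])
    h_within = 0.0  # size-weighted mean within-group diversity (H_S)
    h_total = 0.0   # diversity of the pooled sample (H_T)
    for s in range(n_sites):
        p_a = sum(iso[s] for iso in group_a) / n_a
        p_b = sum(iso[s] for iso in group_b) / n_b
        p_t = (n_a * p_a + n_b * p_b) / (n_a + n_b)
        h_within += (
            n_a * 2 * p_a * (1 - p_a) + n_b * 2 * p_b * (1 - p_b)
        ) / (n_a + n_b)
        h_total += 2 * p_t * (1 - p_t)
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total
```

Under this definition a split would be marked when `fst_two_groups(left, right) > 0.33`, where `left` and `right` hold the genotypes of the isolates descending from each child node.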
Fig. 5. Histogram of the Simpson diversity index calculated for sub-lineages of lineages 1–4.
A data set of 17,432 isolates from 74 countries was used to perform this analysis. Yellow triangles designate the Simpson diversity index values of sub-lineages designated as geographically restricted by Stucki et al. Light gray circles designate the Simpson diversity index values of sub-lineages designated as geographically unrestricted by Stucki et al. Source data are provided as a Source Data file.
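As a rough illustration of how a Simpson diversity index can separate geographically restricted from unrestricted groups, the sketch below computes the Gini-Simpson form, 1 - sum(p_i^2), over the countries of isolation of one sub-lineage. The exact variant used for the figure is not stated in the caption, so this form is an assumption, and the country lists are made-up toy data.

```python
from collections import Counter
from typing import Iterable


def simpson_diversity(countries: Iterable[str]) -> float:
    """Gini-Simpson index 1 - sum(p_i^2) over country frequencies.

    Values near 0 mean most isolates come from one country (restricted);
    values near 1 mean isolates are spread over many countries.
    """
    counts = Counter(countries)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one isolate")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Toy data (not from the paper): country codes for two imaginary groups.
restricted = ["VN"] * 9 + ["TH"]
widespread = ["VN", "TH", "IN", "PE", "ZA", "GB", "US", "BR", "RU", "CN"]
```

With these toy inputs the restricted group scores about 0.18 and the widespread group about 0.9, which is the kind of separation a histogram of the index would show.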
Fig. 6. Geographic distribution of internal sub-lineage 1.1.3.i1.
Colors represent the percentage of 1.1.3.i1 strains isolated in a given country relative to all lineage 1 strains isolated in that country. Source data are provided as a Source Data file.
Fig. 7. Geographic distribution of internal sub-lineage 1.1.1.1.
Colors represent the percentage of 1.1.1.1 strains isolated in a given country relative to all lineage 1 strains isolated in that country. Source data are provided as a Source Data file.
Fig. 8. Geographic distribution of internal sub-lineage 1.1.2.
Colors represent the percentage of 1.1.2 strains isolated in a given country relative to all lineage 1 strains isolated in that country. Source data are provided as a Source Data file.
Fig. 9. Distributions of terminal branch lengths for the four global Mtb lineages (L1–L4).
Two-sided Wilcoxon rank sum tests were used to test whether each pair of distributions differed significantly. Medians: 6.2 × 10⁻⁵ (L4), 8.2 × 10⁻⁵ (L2), 10.2 × 10⁻⁵ (L3), 17.5 × 10⁻⁵ (L1). Comparisons: L1 vs L2, L3 or L4 (p-value < 2.2 × 10⁻¹⁶); L2 vs L3 (p-value = 3.6 × 10⁻⁶); L2 vs L4 (p-value < 2.2 × 10⁻¹⁶); L3 vs L4 (p-value < 2.2 × 10⁻¹⁶). Description of the distributions:
L1: n = 739, Min: 0.5 × 10⁻⁵, 1st Quartile: 6.7 × 10⁻⁵, Median: 17.5 × 10⁻⁵, 3rd Quartile: 28 × 10⁻⁵, Max: 120 × 10⁻⁵
L2: n = 2193, Min: 0.7 × 10⁻⁵, 1st Quartile: 5.3 × 10⁻⁵, Median: 8.2 × 10⁻⁵, 3rd Quartile: 12 × 10⁻⁵, Max: 110 × 10⁻⁵
L3: n = 1103, Min: 0.5 × 10⁻⁵, 1st Quartile: 4.5 × 10⁻⁵, Median: 10.2 × 10⁻⁵, 3rd Quartile: 20 × 10⁻⁵, Max: 80 × 10⁻⁵
L4: n = 5514, Min: 0.2 × 10⁻⁵, 1st Quartile: 2.6 × 10⁻⁵, Median: 6.2 × 10⁻⁵, 3rd Quartile: 13 × 10⁻⁵, Max: 70 × 10⁻⁵
Source data are provided as a Source Data file.
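For readers unfamiliar with the test named in the Fig. 9 caption, the following is a minimal pure-Python sketch of a two-sided Wilcoxon rank sum (Mann-Whitney U) test. It assumes the large-sample normal approximation with a tie correction and no continuity correction, so p-values from statistical packages may differ slightly. The function name `rank_sum_test` is illustrative, and the sketch is not the authors' analysis code.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (U statistic for x, two-sided p-value). Ties receive midranks
    and the variance is tie-corrected; no continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, grp) in zip(ranks, pooled) if grp == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a shift
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_value
```

Applied to two lists of terminal branch lengths, one per lineage, a small p-value would indicate that one lineage tends to have shorter (or longer) terminal branches than the other.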

