Molecular Epidemiology of Mycobacterium tuberculosis Across 3 Distinct Geographic Sites in South Africa

J Infect Dis. 2025 Oct 15;232(4):870-881. doi: 10.1093/infdis/jiaf326.

Abstract

Background: Whole genome sequence data can generate insights about Mycobacterium tuberculosis (Mtb) transmission. We used whole genome sequencing and linked epidemiology data from a recent randomized trial to characterize Mtb relatedness across 3 geographically distinct South African sites.

Methods: We sequenced culture isolates from participants with culture-positive tuberculosis in the Kharituwe study, which evaluated household contact investigation strategies in 1 urban and 2 rural sites. We adapted an existing bioinformatic pipeline to clean, extract, and filter Mtb reads; perform reference alignment; calculate pairwise single-nucleotide polymorphism (SNP) distances between isolates; and group isolates into clusters linked by recent transmission, using 3 different SNP-distance cutoffs. Sequence data were linked to individual-level data on demographics and risk factors. We analyzed clustering across and within study sites and used log-binomial regression to assess characteristics associated with clustering.
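
The distance and clustering steps described above can be illustrated with a minimal Python sketch. It is not the study's pipeline: the function names, the toy sequences, and the choice of single-linkage grouping (two isolates share a cluster if a chain of pairwise links, each within the cutoff, connects them) are assumptions made for illustration, since the abstract does not state the linkage rule. The sketch also assumes the sequences are already aligned to a common reference and skips ambiguous or gap positions.

```python
from itertools import combinations


def snp_distance(a, b, ignore="N-"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an ambiguous base or a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in ignore and y not in ignore
    )


def cluster_by_cutoff(seqs, cutoff):
    """Group isolates by single linkage at a SNP cutoff (an assumed rule).

    Returns clusters with at least 2 members; unclustered isolates are dropped.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


# Hypothetical toy alignment, not study data.
toy = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",  # 1 SNP from iso1
    "iso3": "ACGTACGAAT",  # 1 SNP from iso2, 2 SNPs from iso1
    "iso4": "TTTTACGTGG",  # at least 5 SNPs from every other isolate
}

print(cluster_by_cutoff(toy, cutoff=1))  # [['iso1', 'iso2', 'iso3']]
```

Under single linkage, iso1 and iso3 end up in the same cluster at a cutoff of 1 even though they differ by 2 SNPs, because iso2 links them. This chaining behavior is one reason cluster sizes can grow quickly as the cutoff is relaxed.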

Results: At a cutoff of 12 SNPs, 213 of 714 (30%) sequenced isolates passing quality control filters were clustered. While only 3 of 45 clustered pairs included participants from different sites, most clusters with ≥4 participants included representation from at least 2 sites. Expanding to a 20-SNP cutoff revealed a large cluster containing 10% of isolates, with urban/rural representation mirroring that of all the isolates (61% urban, 39% rural). Participants from the urban site, TB household contacts, and participants reporting a history of incarceration were more likely to be in a cluster.

Conclusions: Observed clustering and strain diversity across sites indicate the presence of multiple ongoing and geographically dispersed outbreaks in this setting.

Keywords: South Africa; genomics; spatial; transmission; tuberculosis.

MeSH terms

  • Adolescent
  • Adult
  • Cluster Analysis
  • Female
  • Genome, Bacterial
  • Humans
  • Male
  • Middle Aged
  • Molecular Epidemiology
  • Mycobacterium tuberculosis* / classification
  • Mycobacterium tuberculosis* / genetics
  • Mycobacterium tuberculosis* / isolation & purification
  • Polymorphism, Single Nucleotide
  • Risk Factors
  • Rural Population
  • South Africa / epidemiology
  • Tuberculosis* / epidemiology
  • Tuberculosis* / microbiology
  • Tuberculosis* / transmission
  • Whole Genome Sequencing
  • Young Adult