Longitudinal profiling of low-abundance strains in microbiomes with ChronoStrain

Nat Microbiol. 2025 May;10(5):1184-1197. doi: 10.1038/s41564-025-01983-z. Epub 2025 May 6.

Abstract

The ability to detect and quantify microbiota over time from shotgun metagenomic data has a plethora of clinical, basic science and public health applications. Given these applications, and the observation that pathogens and other taxa of interest can reside at low relative abundance, there is a critical need for algorithms that accurately profile low-abundance microbial taxa with strain-level resolution. Here we present ChronoStrain: a sequence quality- and time-aware Bayesian model for profiling strains in longitudinal samples. ChronoStrain explicitly models the presence or absence of each strain and produces a probability distribution over abundance trajectories for each strain. Using synthetic and semi-synthetic data, we demonstrate that ChronoStrain outperforms existing methods in abundance estimation and presence/absence prediction. Applying ChronoStrain to two human microbiome datasets demonstrated its improved interpretability for profiling Escherichia coli strain blooms in longitudinal faecal samples from adult women with recurring urinary tract infections, and its improved accuracy for detecting Enterococcus faecalis strains in infant faecal samples. Compared with state-of-the-art methods, ChronoStrain's advantage in detecting low-abundance taxa is particularly stark.

MeSH terms

  • Adult
  • Algorithms
  • Bacteria* / classification
  • Bacteria* / genetics
  • Bacteria* / isolation & purification
  • Bayes Theorem
  • Enterococcus faecalis / classification
  • Enterococcus faecalis / genetics
  • Enterococcus faecalis / isolation & purification
  • Escherichia coli / classification
  • Escherichia coli / genetics
  • Escherichia coli / isolation & purification
  • Feces / microbiology
  • Female
  • Humans
  • Infant
  • Longitudinal Studies
  • Metagenomics* / methods
  • Microbiota* / genetics