Integrating genomic and Tn-Seq data to identify common in vivo fitness mechanisms across multiple bacterial species

mBio. 2025 Nov 12;16(11):e0198825. doi: 10.1128/mbio.01988-25. Epub 2025 Sep 22.

Abstract

Sepsis, a life-threatening organ dysfunction, is due to an unregulated immune response to infection. Bacteremia is a leading cause of sepsis, and members of the Enterobacterales cause nearly half of bacteremia cases annually. Although previous Tn-Seq studies identified novel bacteremia-fitness genes, evidence for common pathways across species is lacking. To identify common fitness pathways in five bacteremia-causing Enterobacterales species, we utilized our pan-genome pipeline to integrate Tn-Seq fitness data with multiple available functional data types. Core genes from species pan-genomes were used to construct a multi-species core pan-genome, producing 2,850 core gene clusters found in four of five species. Integration of Tn-Seq fitness data identified 373 protein clusters that were conserved in all five species and contained a fitness gene in at least one of them. A scoring rubric incorporating Tn-Seq fitness defects, operon localization, and antibiotic susceptibility data was applied to these clusters; it reduced the number of bacteremia-fitness genes and identified seven common fitness mechanisms. Independent mutational validation of one prioritized fitness gene, tatC, showed reduced fitness in vivo across all species tested and increased susceptibility to β-lactams that was restored following tatC complementation in trans. By integrating known operon structures and antibiotic susceptibility with Tn-Seq fitness data, common genes within the core pan-genome emerged and revealed mechanisms essential for survival in the mammalian bloodstream. Our prediction and validation of tatC as a common bacteremia fitness factor supports the utility of this bioinformatic approach. This study represents a major step forward in prioritizing novel targets for therapy against sepsis infections.

Importance: Bacteremia is a leading cause of sepsis, a life-threatening condition where an unregulated immune response to infection causes systemic organ failure. Nearly half of bacteremia cases are caused by members of the Gram-negative bacterial taxonomic order Enterobacterales. Given the public health impact of bacteremia and the reduction of existing antibiotic treatment options, novel strategies are needed to combat these infections. In this study, pan-genome software was used to predict seven shared fitness pathways in these bacteria that may serve as novel targets for the treatment of bacteremia. Briefly, a scoring rubric incorporating multiple data types, including Tn-Seq fitness defects, operon localization, and antibiotic susceptibility data, was applied to shared pan-genome clusters to rank and prioritize fitness genes. To validate one of our predictions, mutations were constructed in tatC; the mutants showed both reduced fitness in mice in all species tested and increased susceptibility to β-lactam antibiotics, and complementation restored fitness and antibiotic susceptibility to wild-type levels. This study takes a novel bioinformatics approach: it builds a core pan-genome across multiple distantly related bacteria and integrates computational and experimental data to predict important shared fitness genes. It represents a major step forward toward identifying novel targets of therapy against these widespread, life-threatening infections.
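
The filtering and ranking steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The species labels, the `Cluster` fields, the reading of "four of five species" as "at least four", and the equal rubric weights are all assumptions made for illustration; the abstract does not give the actual rubric values.

```python
from dataclasses import dataclass

# Placeholder labels: the abstract does not name the five species.
SPECIES = ("sp1", "sp2", "sp3", "sp4", "sp5")


@dataclass(frozen=True)
class Cluster:
    """One multi-species protein cluster (hypothetical structure)."""
    name: str
    present_in: frozenset               # species carrying a cluster member
    fitness_in: frozenset               # species where Tn-Seq showed a fitness defect
    in_operon_with_hit: bool = False    # assumed operon-localization evidence
    antibiotic_phenotype: bool = False  # assumed antibiotic-susceptibility evidence


def is_core(cluster, min_species=4):
    # Assumption: "core" means present in at least four of the five species.
    return len(cluster.present_in) >= min_species


def is_candidate(cluster):
    # Conserved in all five species and a fitness gene in at least one of them.
    return cluster.present_in == frozenset(SPECIES) and len(cluster.fitness_in) >= 1


def score(cluster, w_fitness=1, w_operon=1, w_antibiotic=1):
    # Hypothetical additive rubric with equal weights; the published
    # rubric may weight or combine these data types differently.
    return (
        w_fitness * len(cluster.fitness_in)
        + w_operon * int(cluster.in_operon_with_hit)
        + w_antibiotic * int(cluster.antibiotic_phenotype)
    )


def rank(clusters):
    # Keep candidate clusters only, highest rubric score first.
    candidates = [c for c in clusters if is_candidate(c)]
    return sorted(candidates, key=score, reverse=True)
```

Under these assumptions, a cluster conserved in all five species with a fitness defect in each, plus operon and antibiotic evidence, would rank above a cluster whose only support is a fitness defect in a single species.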

Keywords: Enterobacterales; Tn-Seq; bacteremia; fitness genes; multi-species core genome.

MeSH terms

  • Animals
  • Anti-Bacterial Agents / pharmacology
  • Bacteremia / microbiology
  • Enterobacteriaceae Infections / microbiology
  • Enterobacteriaceae* / drug effects
  • Enterobacteriaceae* / genetics
  • Genetic Fitness*
  • Genome, Bacterial*
  • Genomics* / methods
  • Humans
  • Mice
  • Multigene Family

Substances

  • Anti-Bacterial Agents