117 (5), 2622-2633

Identifying Determinants of Bacterial Fitness in a Model of Human Gut Microbial Succession



Lihui Feng et al. Proc Natl Acad Sci U S A.


Human gut microbiota development has been associated with healthy growth but understanding the determinants of community assembly and composition is a formidable challenge. We cultured bacteria from serially collected fecal samples from a healthy infant; 34 sequenced strains containing 103,102 genes were divided into two consortia representing earlier and later stages in community assembly during the first six postnatal months. The two consortia were introduced alone (singly), or sequentially in different order, or simultaneously into young germ-free mice fed human infant formula. The pattern of fitness of bacterial strains observed across the different colonization conditions indicated that later-phase strains substantially outcompete earlier-phase strains, although four early-phase members persist. Persistence was not determined by order of introduction, suggesting that priority effects are not prominent in this model. To characterize succession in the context of the metabolic potential of consortium members, we performed in silico reconstructions of metabolic pathways involved in carbohydrate utilization and amino acid and B-vitamin biosynthesis, then quantified the fitness (abundance) of strains in serially collected fecal samples and their transcriptional responses to different histories of colonization. Applying feature-reduction methods disclosed a set of metabolic pathways whose presence and/or expression correlates with strain fitness and that enable early-stage colonizers to survive during introduction of later colonizers. The approach described can be used to test the magnitude of the contribution of identified metabolic pathways to fitness in different community contexts, study various ecological processes thought to govern community assembly, and facilitate development of microbiota-directed therapeutics.

Keywords: feature-reduction algorithms; gut microbiome; metabolic pathways; microbial community assembly/succession.

Conflict of interest statement

Competing interest statement: J.I.G. is a cofounder of Matatu, Inc., a company characterizing the role of diet-by-microbiota interactions in animal health.


Fig. 1.
Modeling microbial succession in gnotobiotic mice. Heat map of the fractional abundances of bacterial strains in the feces of gnotobiotic mice fed human IF and colonized with S1, S2, S1→S2, S2→S1, or S1+S2 consortia. Abundances were defined by shotgun sequencing of fecal DNAs prepared from samples collected on the indicated days post gavage (dpg). Average fractional abundances for S1 and S2 strains represented in the microbiota of mice belonging to each treatment group at each time point are shown (n = 5 mice per group; see Dataset S3A for values from individual animals).
Fig. 2.
Expressed metabolic pathways related to the fitness of S1 and S2 members in tandem colonization experiments. (A) Analysis of TPM-normalized microbial RNA-Seq datasets. A matrix of mcSEED metabolic pathways (columns) for all S1 and S2 organisms subject to the S1→S2, S2→S1, and S1+S2 colonization sequences (rows) is created where each element is the aggregated number of transcripts for a given pathway (92 pathways in total). A pseudocount of 0.1 is added to each element of the matrix. Each row is log-normalized against the reference row (P. distasonis in the S1→S2 colonization sequence) to create an mcSEED pathway/module relative expression profile for each organism in the climax community resulting from each of the three colonization treatments. See main text for further information about the terms used in the equation shown for relative expression score (REix). (B) mcSEED pathway/module relative expression profiles plotted in PCA space for the indicated colonization conditions using P. distasonis under the S1→S2 colonization condition as a reference. Red, blue, and black indicate the different colonization treatments. Strain names are colored green and black based on whether they are members of the S1 or S2 consortium, respectively. (C) Projection along PC1 in B is plotted against average fractional abundance for the indicated organisms on the last day of fecal sampling (dpg 28) for each of the three tandem colonization conditions. Names of organisms are color-coded as in B. Horizontal lines denote the SD of mean fractional abundance for the indicated organisms in a particular colonization condition. See Dataset S7 for details.
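The normalization described in (A) and the projection in (B) can be sketched as follows. This is a minimal illustration, not the authors' code: the legend does not give the formula for the relative expression score, so the log base (base 2 here), the toy count matrix, and the function names `relative_expression` and `pca_project` are all assumptions. Only the pseudocount of 0.1 and the use of a reference row come from the legend.

```python
import numpy as np

def relative_expression(counts, ref_row, pseudocount=0.1):
    """Log-ratio of every row against a reference row.

    counts: rows = organism/colonization-condition pairs,
            columns = pathways (aggregated transcript counts).
    The base-2 logarithm is an assumption; the legend only says
    "log-normalized".
    """
    shifted = np.asarray(counts, dtype=float) + pseudocount
    return np.log2(shifted / shifted[ref_row])

def pca_project(profiles, n_components=2):
    """Project rows onto the leading principal components via SVD."""
    centered = profiles - profiles.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Hypothetical toy matrix: 4 organism/condition rows x 3 pathways.
counts = np.array([
    [100.0, 50.0, 0.0],   # reference row
    [200.0, 50.0, 10.0],
    [50.0, 0.0, 40.0],
    [100.0, 25.0, 20.0],
])

rel_expr = relative_expression(counts, ref_row=0)
scores = pca_project(rel_expr, n_components=2)
```

By construction the reference row has a relative expression of zero for every pathway, and the pseudocount keeps the log-ratio finite for pathways with no detected transcripts.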
Fig. 3.
Using SVD to identify pathways distinguishing bacterial strains with different fitness characteristics. (A) The mathematical relationship between the correlation structure of strains and mcSEED pathways/modules is depicted. The relationship between S1 and S2 strains (n = 36) is given by the 36 × 36 correlation matrix X_ij and between mcSEED pathway/modules (n = 18) by the 18 × 18 correlation matrix F_ij. The equation for eigendecomposition of each correlation matrix is shown within the matrix. SVD relates the two correlation matrices by transforming the relative expression matrix (ME) into a product of three different matrices, U, V, and Σ^(1/2). U and V are matrices of the left and right singular vectors from the strain and mcSEED pathway/modules correlation matrices, respectively; they are related by the singular values contained within Σ^(1/2). (B) Histogram of the projection of mcSEED metabolic pathway/modules along the first right singular vector computed by SVD. (C) Heat map of mcSEED metabolic pathway/module relative expression relative to the reference condition highlighted in boldface (P. distasonis in the S1→S2 colonization condition) (Dataset S8B). Strains are hierarchically clustered according to the relative expression profile of the mcSEED metabolic pathways/modules that project within the 10th percentile of the histogram shown in B. Strain names are colored based on their membership in the S1 or S2 consortium as in Fig. 2B. (D) The source of the relative expression score for each organism/metabolic pathway pair from C is indicated by the coded key. “Reference pathway” refers to the mcSEED metabolic pathway/module in the reference organism, P. distasonis in S1→S2 colonization condition. “Test pathway” refers to the pathway/module of a test organism in the indicated colonization condition. See Dataset S9 for values associated with each symbol in the matrix.
Symbol key: formula image, indicated metabolic pathway/module present and expressed in both test and reference organisms with expression in the test organism being statistically significantly different from the reference organism; formula image, pathway/module present and expressed in test and reference strains but not at statistically significantly different levels; formula image, present but not expressed in test, but present and expressed in reference strain; formula image, absent in test strain, but present and expressed in reference strain; formula image, present and expressed in test but absent in reference strain; formula image, present but not expressed in test strain and absent in reference strain; formula image, absent in both test and reference strains.
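The feature-reduction step in (A) and (B) can be illustrated with a short sketch. It is written under stated assumptions and is not the published pipeline: the data are random toy values, the matrix is column-centered before decomposition, and "within the 10th percentile" is read here as the lowest and highest 10% of projections. The function names `first_vector_loadings` and `extreme_pathways` are hypothetical.

```python
import numpy as np

def first_vector_loadings(rel_expr):
    """SVD of a strains x pathways matrix.

    Returns the pathway loadings on the first right singular vector
    and the full set of singular values.
    """
    centered = rel_expr - rel_expr.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0], s

def extreme_pathways(loadings, tail_pct=10.0):
    """Indices of pathways in the lower or upper tail of the loadings.

    Treating both tails as "extreme" is an assumption about how the
    percentile cutoff in the legend is applied.
    """
    lo, hi = np.percentile(loadings, [tail_pct, 100.0 - tail_pct])
    return np.flatnonzero((loadings <= lo) | (loadings >= hi))

# Toy data: 12 strain/condition rows x 20 pathways of noise, with
# pathway 3 made to separate the first six rows from the last six.
rng = np.random.default_rng(0)
rel_expr = rng.normal(size=(12, 20))
rel_expr[:6, 3] += 5.0
rel_expr[6:, 3] -= 5.0

loadings, singular_values = first_vector_loadings(rel_expr)
selected = extreme_pathways(loadings)
```

The squared singular values equal the nonzero eigenvalues of both the strain-by-strain and the pathway-by-pathway cross-product matrices, which is the sense in which SVD ties the two correlation structures together.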
Fig. 4.
S1 members that survive after introduction of the S2 consortium retain their mcSEED metabolic pathway expression pattern. (A) mcSEED metabolic pathway/module relative expression profiles defined by the 18 mcSEED pathways/modules identified by SVD in Fig. 3B for each organism under all colonization conditions are plotted on a PCA space. The relative expression profile of E. faecium (S1 alone) is used as the reference. The space on the right of the panel has been rotated by 60° for ease of visualization. (B–D) SVD performed on the PCA space in A yields projections of mcSEED metabolic pathways/modules onto right singular vectors 1 to 3. Histograms of these projections onto right singular vector 1 (B), vector 2 (C), and vector 3 (D) are shown. Projections within the positive and negative 10th percentiles for the first and second singular vectors are labeled and included in E. As the third right singular vector is negatively skewed, the projections within the negative 20th percentile are labeled and also included in E. (E) The mcSEED metabolic pathways/modules that define separation of species’ relative expression profiles along PC1, PC2, and PC3 are shown in heat-map form for all organisms in all colonization conditions. mcSEED metabolic pathway/module relative expression scores are shown as are a breakdown of the elements of that pathway/module’s score (i.e., pathway present/absent; whether or not expressed; whether expressed at levels significantly different from the pathway in the reference organism). See the legend to Fig. 3D for the key to the symbols used in each cell in the lower portion of the panel and Dataset S13 for the values associated with the symbols. (F) Box plots of levels of N-acetylgalactosamine and tagatose in cecal contents harvested from mice belonging to the indicated treatment group.



