Genome Res. 2018 Oct;28(10):1467-1480.
doi: 10.1101/gr.236000.118. Epub 2018 Sep 19.

Metagenomic Analysis With Strain-Level Resolution Reveals Fine-Scale Variation in the Human Pregnancy Microbiome

Free PMC article


Daniela S Aliaga Goltsman et al. Genome Res.

Abstract

Recent studies suggest that the microbiome has an impact on gestational health and outcome. However, characterization of the pregnancy-associated microbiome has largely relied on 16S rRNA gene amplicon-based surveys. Here, we describe an assembly-driven, metagenomics-based, longitudinal study of the vaginal, gut, and oral microbiomes in 292 samples from 10 subjects sampled every three weeks throughout pregnancy. A total of 1.53 Gb of nonhuman sequence was assembled into scaffolds, and functional genes were predicted for gene- and pathway-based analyses. Vaginal assemblies were binned into 97 draft quality genomes. Redundancy analysis (RDA) of microbial community composition at all three body sites revealed gestational age to be a significant source of variation in patterns of gene abundance. In addition, health complications were associated with variation in community functional gene composition in the mouth and gut. The diversity of Lactobacillus iners-dominated communities in the vagina, unlike most other vaginal community types, significantly increased with gestational age. The genomes of co-occurring Gardnerella vaginalis strains with predicted distinct functions were recovered in samples from two subjects. In seven subjects, gut samples contained strains of the same Lactobacillus species that dominated the vaginal community of that same subject and not other Lactobacillus species; however, these within-host strains were divergent. CRISPR spacer analysis suggested shared phage and plasmid populations across body sites and individuals. This work underscores the dynamic behavior of the microbiome during pregnancy and suggests the potential importance of understanding the sources of this behavior for fetal development and gestational outcome.

Figures

Figure 1.
Community structure of the pregnancy microbiome over time in 10 subjects. (A) Relative abundance of vaginal genome bins (y-axis). Abundance was estimated from the number of reads that mapped to each bin and normalized by the length of the bin. The top 10 most abundant vaginal taxa are displayed (see key at bottom). (Unbinned) Sequences that were not assigned to a classified genome bin. (B,C) Relative abundance (y-axis) of the top 50 most abundant taxa across all subjects in saliva (B) and gut (C) samples, respectively. Gut samples from subject Pre3 were not available. Species abundance was estimated from the average read counts of single-copy ribosomal protein (RP) sets (at least one of 16), summed over scaffolds sharing RPs clustered at 99% amino acid identity. Each taxon is represented by a distinct color (see key for selected taxa at bottom) and is classified at the most resolved level possible. Co-occurring strains of Rothia mucilaginosa, Streptococcus sanguinis, and Streptococcus parasanguinis in B, and of Bacteroides vulgatus and Akkermansia spp. CAG:344 in C are highlighted with black boxes.
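The length normalization described for panel A can be illustrated with a minimal sketch. It assumes that relative abundance is each bin's reads-per-base divided by the sum over all bins in a sample; the caption does not state the exact formula used, and the function name, bin names, and numbers below are hypothetical.

```python
def relative_abundance(read_counts, bin_lengths):
    """Length-normalized relative abundance per genome bin (illustrative only)."""
    # Reads per base corrects for longer bins recruiting more reads.
    density = {b: read_counts[b] / bin_lengths[b] for b in read_counts}
    total = sum(density.values())
    if total == 0:
        # No mapped reads in this sample: report zero for every bin.
        return {b: 0.0 for b in density}
    # Scale so that the values in one sample sum to 1.
    return {b: d / total for b, d in density.items()}


# Hypothetical sample with two bins of different lengths.
counts = {"bin_A": 9000, "bin_B": 1000}
lengths = {"bin_A": 1_500_000, "bin_B": 500_000}
abund = relative_abundance(counts, lengths)
```

Without the length correction, bin_A would appear to hold 90% of the community by raw read count; after normalization its share is smaller because it is three times longer than bin_B.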
Figure 2.
Sources of variation in abundance of UniRef90 gene families across all subjects and samples. Top: (A,B,D,F) Nonmetric multidimensional scaling (NMDS) plots from Bray-Curtis distance matrices of variance-stabilized gene family abundances. The stress, “s” (the amount of variability unexplained by the NMDS ordination), is shown on each plot. Bottom: (C,E,G) Redundancy analysis (RDA) plots from variance-stabilized gene family abundances. The P-values for the RDA plots were estimated with the anova.cca function from the vegan package in R. (A,B) NMDS split plot from vaginal samples: The “samples” plot (A) was color-coded based on subject, whereas the “genes” plot (B) was color-coded based on the taxonomic classification of genes (gray dots: genes belonging to other taxa). (C) RDA plot from vaginal samples, constrained by gestational age (GA) in weeks and by the most abundant taxon in each sample. Samples are color-coded based on the most abundant taxon. (D,E) NMDS and RDA plots from gut samples. (F,G) NMDS and RDA plots from saliva samples. Gestational age and health complication were used to constrain the RDA analysis in E and G. Complication: uncomplicated (five subjects); preeclampsia (three subjects); other, i.e., type 2 diabetes (one subject); and oligohydramnios (one subject).
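For readers unfamiliar with the distance underlying the NMDS panels, the Bray-Curtis dissimilarity between two samples can be sketched as follows. This is a plain-Python illustration that assumes non-negative abundance vectors of equal length; it is not the authors' pipeline, which used the vegan package in R, and the vectors shown are hypothetical.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(x) != len(y):
        raise ValueError("abundance vectors must have the same length")
    # Numerator: summed absolute differences per gene family.
    num = sum(abs(a - b) for a, b in zip(x, y))
    # Denominator: total abundance across both samples.
    den = sum(a + b for a, b in zip(x, y))
    # Two empty samples are treated as identical (an assumption of this sketch).
    return num / den if den > 0 else 0.0


# Hypothetical abundances of three gene families in two samples.
sample_1 = [1.0, 2.0, 3.0]
sample_2 = [3.0, 2.0, 1.0]
d = bray_curtis(sample_1, sample_2)
```

A value of 0 means the two samples have identical abundance profiles, and a value of 1 means they share no gene families.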
Figure 3.
Gestational age trends for abundances of gene families in saliva samples for each subject. Gestational age was used to constrain the redundancy analysis of variance-stabilized gene family abundances within individuals. Gestational age effect is observed along the x-axis (RDA1 axis), and points within plots were connected based on the resulting ordination scores. P-values were calculated with ANOVA on the RDA ordination constraint using the anova.cca function of the vegan package in R.
Figure 4.
Gardnerella vaginalis genome analysis. 16S rRNA gene abundance of G. vaginalis strains recovered from each of two subjects, Pre2 (A) and Term4 (B). G. vaginalis genotypic groups, C1 and C2, are colored according to the classification in C. Relative abundance (left y-axis). Estimated iRep values are plotted for G. vaginalis strains in subject Pre2 (right y-axis). (C) Phylogenomic tree of 40 G. vaginalis strain genomes, including 34 available in GenBank. The six genomes recovered in the current study are shown in bold. Colored bars represent genotypic groups within the G. vaginalis phylogeny, where colors for clades 1–4 match the G. vaginalis genotypic groups defined by Ahmed et al. (2012). FastTree branch support for the most visible nodes is shown. (D) Radial representation of the same phylogenomic tree displayed in C, where leaves are colored based on 16S rRNA V4 sequence variant classification defined by Callahan et al. (2017). Genomes for which a full-length 16S rRNA sequence or V4 sequence were not available are shown in black.
Figure 5.
Comparative genomics of Lactobacillus spp. and estimated replication rates. (A) Multiple genome alignment of five L. iners genomes recovered through metagenomics along with the reference strain DSM 13335 (see Methods). A modified alignment created in Mauve shows the shared genomic context, as well as genomic islands unique to a strain (white areas within a conserved block). (B,C) Relative abundance and estimated genome replication values (iRep) for L. iners and L. crispatus, respectively.
Figure 6.
Diversity and distribution of CRISPR-Cas systems and CRISPR spacer targets (mobile elements). (A) Distribution of CRISPR-Cas system types in the gut (gray) and saliva (black) samples in this study (I, II, III, V). Vaginal samples are not shown due to low bacterial diversity. The black lines indicate standard deviation. (B) Frequency of matches between spacers and scaffolds. This graph shows the total number of matches by body site (outlined gray, vaginal spacers; light gray, saliva spacers; and dark gray, gut spacers). There were a total of 78,054 matches between spacers and all pregnancy scaffolds. Overall, we detected 3254 spacer types from vaginal samples, 36,477 spacer types from gut samples, and 36,279 spacer types from saliva samples.


