PLoS One. 2012;7(6):e32118.
doi: 10.1371/journal.pone.0032118. Epub 2012 Jun 13.

Analyses of the Microbial Diversity Across the Human Microbiome

Free PMC article

Kelvin Li et al. PLoS One. 2012.

Analysis of human body microbial diversity is fundamental to understanding community structure, biology and ecology. The National Institutes of Health Human Microbiome Project (HMP) has provided an unprecedented opportunity to examine microbial diversity within and across body habitats and individuals through pyrosequencing-based profiling of 16S rRNA gene sequences (16S) from habitats of the oral, skin, distal gut, and vaginal body regions of over 200 healthy individuals, a sample size that enables the application of statistical techniques. In this study, two approaches were applied to elucidate the nature and extent of human microbiome diversity. First, bootstrap and parametric curve fitting techniques were evaluated to estimate the maximum number of unique taxa, S(max), and the taxa discovery rate for habitats across individuals. Next, our results demonstrated that the variation of diversity within low-abundance taxa across habitats and individuals was not sufficiently quantified by standard ecological diversity indices. This impact of low-abundance taxa motivated us to introduce a novel rank-based diversity measure, the Tail statistic (τ), based on the standard deviation of the rank abundance curve after it is made symmetric by reflection around the most abundant taxon. Because τ is more sensitive to low-abundance taxa, applying it to diversity estimation of taxonomic units, using both taxonomy-dependent and taxonomy-independent methods, revealed a greater range of values between individuals than between body habitats, as well as different patterns of diversity within habitats. The greatest range of τ values within and across individuals was found in stool, which also exhibited the most undiscovered taxa. Oral and skin habitats revealed variable diversity patterns, while vaginal habitats were consistently the least diverse.
Collectively, these results demonstrate the importance, and motivate the introduction, of several visualization and analysis methods tuned specifically for next-generation sequence data, further revealing that low abundant taxa serve as an important reservoir of genetic diversity in the human microbiome.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.


Figure 1. Contribution of taxa to diversity.
A theoretical rank abundance curve (a PDF) is overlaid with its CDF (black) as a “Pareto chart”. The overlaid colored lines represent each diversity index as lower-abundance taxonomic units are included. For example, at “c”, the height of each curve represents the relative value of the index if the sample were composed only of a, b, and c. The more quickly an index curve reaches its maximum normalized value of 1.0, the less capable the index is of resolving low-abundance taxonomic units. The graph shows that the Shannon and Simpson diversity indices approach their saturation point more quickly than the Tail statistic or a Renyi entropy with a fractional alpha.
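The saturation behaviour described in this caption can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names, the Gini-Simpson form of the Simpson index, the renormalization of the truncated profile, and the toy counts are all assumptions made for the example.

```python
import math

def normalize(counts):
    """Convert raw counts to proportions, dropping empty taxa."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]

def shannon(p):
    """Shannon entropy (natural log) of a list of proportions."""
    return -sum(x * math.log(x) for x in p)

def simpson(p):
    """Gini-Simpson index: chance that two random draws differ (assumed form)."""
    return 1.0 - sum(x * x for x in p)

def renyi(p, alpha):
    """Renyi entropy of order alpha; alpha == 1 falls back to Shannon."""
    if alpha == 1:
        return shannon(p)
    return math.log(sum(x ** alpha for x in p)) / (1.0 - alpha)

def saturation_curve(index, counts):
    """Index on the top-k taxa (renormalized), divided by its value on the
    full profile, for k = 1..N. Renormalizing the truncated profile is an
    assumption about how the figure was drawn."""
    ranked = sorted(counts, reverse=True)
    full = index(normalize(ranked))
    return [index(normalize(ranked[:k])) / full
            for k in range(1, len(ranked) + 1)]

# Hypothetical read counts for six taxa, most to least abundant.
toy_counts = [50, 25, 15, 6, 3, 1]
simpson_curve = saturation_curve(simpson, toy_counts)
renyi_curve = saturation_curve(lambda p: renyi(p, 0.5), toy_counts)
```

On this toy profile the Simpson curve is closer to 1.0 after three taxa than the Renyi curve with alpha = 0.5, which matches the qualitative ordering the caption describes.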
Figure 2. Body habitats ordered by diversity measure.
Body regions are color coded, Oral-black, Skin-red, Vaginal-green, and Stool-blue. Subfigures a, b, and c, were computed on genera-based taxonomic units. Subfigures d, e, and f, were computed on OTU-based taxonomic units.
Figure 3. Comparison of diversity indices for median versus pooled taxonomic profiles.
Simple regression lines were drawn in solid black for each scatterplot of median individual versus pooled samples. The dashed blue lines (slope = 1, y-intercept = 0) represent where a hypothetical (median = pooled) relationship would exist if all individuals had identical taxonomic profiles. Both the OTU-based and genera-based comparisons using the Shannon diversity index indicate only a slight and almost constant elevation of the diversity between the median individual and pooled samples. However, τ is able to capture the lengthening tail attributed to the low-abundance taxa that are exclusive to certain individuals. See Table S2 for a mapping of abbreviations to habitat names. Green, red, black and blue points represent vaginal, skin, oral, and stool body regions, respectively.
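A minimal sketch of the median-versus-pooled comparison follows. It assumes that "pooled" means summing raw counts across individuals and that "median" means the median of the per-individual index values. The function names and toy profiles are hypothetical.

```python
import math
import statistics
from collections import Counter

def shannon(counts):
    """Shannon entropy (natural log) of a taxon -> count mapping."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def median_and_pooled(samples, index=shannon):
    """Return (median per-individual index, index of the pooled profile).
    Pooling by summing raw counts is an assumption of this sketch."""
    median_value = statistics.median(index(s) for s in samples)
    pooled = Counter()
    for s in samples:
        pooled.update(s)  # adds counts taxon by taxon
    return median_value, index(pooled)

# Hypothetical individuals sharing one dominant taxon, each with its own rare taxon.
individuals = [
    {"A": 90, "x": 10},
    {"A": 90, "y": 10},
    {"A": 90, "z": 10},
]
median_value, pooled_value = median_and_pooled(individuals)
```

With identical profiles the two values coincide, which corresponds to the dashed line in the figure. Taxa exclusive to single individuals push the pooled value above the median.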
Figure 4. Low abundance, high ubiquity taxa.
This figure illustrates the relationship between abundance and ubiquity when defining a core microbiome. As one would expect, increasing the abundance threshold for deciding whether a sample contains a particular taxon reduces the percentage of samples (ubiquity) that contain it. The lines shown correspond to all taxa in the stool samples that are present in more than 97.5% of the samples at an abundance cutoff of 0.05%. The taxon Bacteroides (red) is both relatively highly abundant and highly ubiquitous, so its fall-off is less steep than that of the Clostridiales shown.
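The abundance-ubiquity trade-off can be illustrated with a short Python sketch. The function names, the use of a greater-than-or-equal comparison at the cutoff, and the toy samples (including the placeholder taxon "TaxonX") are assumptions for illustration and are not taken from the study.

```python
def ubiquity(samples, taxon, cutoff):
    """Fraction of samples in which `taxon` reaches a relative abundance of
    at least `cutoff` (whether the comparison is inclusive is an assumption)."""
    present = 0
    for counts in samples:
        total = sum(counts.values())
        if total > 0 and counts.get(taxon, 0) / total >= cutoff:
            present += 1
    return present / len(samples)

def ubiquity_curve(samples, taxon, cutoffs):
    """Ubiquity of one taxon evaluated at each abundance cutoff."""
    return [(c, ubiquity(samples, taxon, c)) for c in cutoffs]

# Hypothetical stool samples: taxon -> read count, 1000 reads each.
stool = [
    {"Bacteroides": 600, "TaxonX": 5, "Other": 395},
    {"Bacteroides": 300, "TaxonX": 1, "Other": 699},
    {"Bacteroides": 450, "TaxonX": 2, "Other": 548},
]
cutoffs = [0.0005, 0.003, 0.4]  # 0.05%, 0.3%, 40%
bacteroides_curve = ubiquity_curve(stool, "Bacteroides", cutoffs)
rare_curve = ubiquity_curve(stool, "TaxonX", cutoffs)
```

Both toy taxa are present in every sample at the 0.05% cutoff, but the rare one drops out much sooner as the cutoff rises.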
Figure 5. OTU-to-Genera ratios.
The median ratio of OTUs to Genera was calculated and plotted from greatest to least for each body habitat. These medians and 95% confidence intervals were estimated with bootstrapping by resampling from the combined distribution of OTUs and Genera to a common read depth. The common read depth chosen was that of the body habitat with the least read coverage, the left antecubital fossa.
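The bootstrap described here might look roughly like the sketch below. It simplifies the procedure and makes several assumptions: each read is represented as an (OTU, genus) pair, reads are resampled with replacement, and the interval is a plain percentile interval. The function names, the `n_boot` and `seed` parameters, and the toy reads are hypothetical.

```python
import random
import statistics

def otu_genus_ratio(reads):
    """Number of distinct OTUs divided by number of distinct genera."""
    otus = {otu for otu, _ in reads}
    genera = {genus for _, genus in reads}
    return len(otus) / len(genera)

def bootstrap_ratio(reads, depth, n_boot=1000, seed=0):
    """Resample `depth` reads with replacement `n_boot` times and return the
    median ratio with a 95% percentile interval (assumed interval type)."""
    rng = random.Random(seed)
    ratios = sorted(otu_genus_ratio(rng.choices(reads, k=depth))
                    for _ in range(n_boot))
    lo = ratios[int(0.025 * (n_boot - 1))]
    hi = ratios[int(0.975 * (n_boot - 1))]
    return statistics.median(ratios), (lo, hi)

# Hypothetical habitat: three OTUs assigned to two genera.
toy_reads = ([("otu1", "A")] * 50
             + [("otu2", "A")] * 30
             + [("otu3", "B")] * 20)
median_ratio, (ci_low, ci_high) = bootstrap_ratio(toy_reads, depth=40, n_boot=200)
```

Resampling every habitat to the same `depth` keeps habitats with deeper sequencing from appearing richer merely because more reads were drawn.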
Figure 6. Comparison of all and “common” taxonomic units and their effect on the Shannon and τ statistics.
For both genera-based and OTU-based taxonomic units, the Shannon diversity index and τ were compared against the median estimated Smax on all (blue) and common (green) taxonomic units. Each point in the scatterplot represents one of the 18 body habitats. For both genera-based and OTU-based profiles, τ tracks Smax more closely than the Shannon diversity index does. The red line represents a simple regression line across all points.
Figure 7. Dominance Profiles.
These stacked bar plots help to compare the low abundant taxonomic units, which may be difficult to visualize with rank abundance curves alone. The number of taxonomic units for each body region is represented by the height of each bar plot. The proportions that are colored represent the relative logarithm of abundance with the color key on the left. The subpanels, a and b, represent genera and OTUs, respectively.
Figure 8. Relationship between the Tail statistic, τ, and Standard Deviation, σ.
τ is the standard deviation of the rank abundance curve after reflection around the most dominant taxonomic unit, i = 1. The blue bars represent the rank abundance curve. Above each bar, the probability, Pr[i], of the i-th most dominant taxonomic unit has been labelled in italics. The natural numbers labelled in bold above the blue bars represent the rank, i, of each taxonomic unit. The name of each taxonomic unit is labelled along the x-axis. The grey bars represent the mirror image of the rank abundance curve. Treating i = 1 of the symmetric distribution as μ = 1, the standard deviation, σ, is then 3.764, which also represents τ for this rank abundance curve and sample.
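One way to read this definition is sketched below. It assumes the mirrored distribution keeps Pr[1] at rank 1 and splits every other Pr[i] equally between rank i and its mirror image. Under that assumption the variance around μ = 1 reduces to the sum of Pr[i]·(i − 1)² over all ranks, and τ is its square root. That reduction, the function name, and the toy counts are assumptions of this sketch and are not spelled out in the caption.

```python
import math

def tail_statistic(counts):
    """Tail statistic under the assumed reading:
    tau = sqrt(sum_i Pr[i] * (i - 1)^2), with ranks i starting at 1."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    total = sum(ranked)
    # enumerate() starts at 0, so `offset` equals (rank - 1).
    return math.sqrt(sum((c / total) * offset ** 2
                         for offset, c in enumerate(ranked)))

# Hypothetical profiles with the same dominant taxon but different tails.
short_tail = tail_statistic([90, 10])
long_tail = tail_statistic([90, 5, 3, 1, 1])
```

Because each taxon is weighted by the squared distance of its rank from 1, splitting the same minority reads across more low-abundance taxa raises τ, which is the sensitivity the text attributes to the statistic.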

