Review. Syst Biol. 2015 Jan;64(1):e26-41.
doi: 10.1093/sysbio/syu053. Epub 2014 Aug 7.

Phylogenetics and the human microbiome

Frederick A Matsen 4th. Syst Biol. 2015 Jan.

Abstract

The human microbiome is the ensemble of genes in the microbes that live inside and on the surface of humans. Because microbial sequencing information is now much easier to come by than phenotypic information, there has been an explosion of sequencing and genetic analysis of microbiome samples. Much of the analytical work for these sequences involves phylogenetics, at least indirectly, but methodology has developed in a somewhat different direction than for other applications of phylogenetics. In this article, I review the field and its methods from the perspective of a phylogeneticist and describe current challenges for phylogenetics arising from this type of work.

Keywords: 16S; human microbiome; human microbiota; metagenome; microbial ecology; phylogenetic methods.

Figures

Figure 1.
Unweighted phylogenetic diversity (PD, left) and an abundance-weighted PD measure (right), where taxa present in a sample are shown as circles and abundances are shown as the size of the circles. Unweighted PD takes the total length of branches sitting between tree tips represented in a sample. Abundance-weighted measures take a weighted sum of branch lengths where weight is determined in some way by the abundance of the taxa on either side of the branch: if we give edges width according to their weight, the abundance-weighted measure can be thought of as the sum of the total area of the edges. One such abundance-weighted measure simply takes the absolute value of the difference of the total read abundance on one side compared with the other.
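The two quantities in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular package: the toy tree `TREE`, the function names, and the child-to-parent dictionary layout are all hypothetical. Unweighted PD is computed here as the total length of edges that have sampled tips on both sides. For the weighted version, the per-edge weight is left as a parameter; the default `2 * min(p, 1 - p)`, where `p` is the fraction of reads below the edge, is an assumed choice that gives zero weight to edges with no sampled reads on one side, and it is not the specific example measure named in the caption.

```python
# Hypothetical toy tree: child -> (parent, branch length); "R" is the root.
TREE = {"t1": ("A", 1.0), "t2": ("A", 2.0), "A": ("R", 0.5), "t3": ("R", 3.0)}


def mass_below(tree, abundances):
    """For each edge (keyed by its child node), sum the abundance of the tips below it."""
    below = {node: 0.0 for node in tree}
    for tip, count in abundances.items():
        node = tip
        while node in tree:  # walk from the tip up to the root
            below[node] += count
            node = tree[node][0]
    return below


def unweighted_pd(tree, abundances):
    """Total length of the edges that separate at least two sampled tips."""
    present = {tip: 1.0 for tip, count in abundances.items() if count > 0}
    below = mass_below(tree, present)
    total = len(present)
    return sum(length for node, (_, length) in tree.items()
               if 0 < below[node] < total)


def weighted_pd(tree, abundances, weight=lambda p: 2 * min(p, 1 - p)):
    """Sum of branch length times a weight set by the read fraction below each edge.

    The default weight is an assumption made for this sketch.
    """
    total = sum(abundances.values())
    if total == 0:
        return 0.0
    below = mass_below(tree, abundances)
    return sum(length * weight(below[node] / total)
               for node, (_, length) in tree.items())


sample = {"t1": 3, "t2": 1}          # read counts per tip (made-up numbers)
pd_unweighted = unweighted_pd(TREE, sample)
pd_weighted = weighted_pd(TREE, sample)
```

With the made-up sample above, only the two tip edges leading to `t1` and `t2` contribute, since the edge above their common ancestor has every sampled read on one side.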
Figure 2.
The UniFrac divergence measure (figure adapted from Lozupone and Knight 2005). Assume that the sequence data used to build the phylogenetic tree derive from two samples: the light-shaded sample and the dark-shaded sample (green and blue in the online version). When the samples are interspersed across the tree (left tree), a smaller fraction of the branch length sits ancestral to clades composed uniquely of one sample or the other than when the samples are separate (right tree). The bottom pictorial equation shows the ratio of interest for UniFrac: the branch length unique to one sample divided by the total branch length. This ratio is smaller when the samples are interspersed (left tree) than when they are separate (right tree).
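The ratio described in this caption can be made concrete with a short Python sketch. It assumes a rooted tree stored as a hypothetical child-to-parent dictionary, treats each sample as a set of tip names, and counts an edge as "unique" when the tips below it come from exactly one of the two samples. The tree, names, and branch lengths are invented for illustration and are not taken from the figure.

```python
# Hypothetical toy tree with two cherries: child -> (parent, branch length).
TREE = {
    "a": ("X", 1.0), "b": ("X", 1.0),
    "c": ("Y", 1.0), "d": ("Y", 1.0),
    "X": ("R", 1.0), "Y": ("R", 1.0),
}


def edges_above(tree, tips):
    """Return the edges (keyed by child node) ancestral to any of the given tips."""
    covered = set()
    for tip in tips:
        node = tip
        # Stop early once we reach an edge already covered: its ancestors are too.
        while node in tree and node not in covered:
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(tree, sample_1, sample_2):
    """Branch length unique to one sample divided by total branch length."""
    edges_1 = edges_above(tree, sample_1)
    edges_2 = edges_above(tree, sample_2)
    unique = sum(tree[node][1] for node in edges_1 ^ edges_2)
    total = sum(tree[node][1] for node in edges_1 | edges_2)
    return unique / total if total else 0.0


interspersed = unweighted_unifrac(TREE, {"a", "c"}, {"b", "d"})
separate = unweighted_unifrac(TREE, {"a", "b"}, {"c", "d"})
```

On this toy tree the interspersed arrangement shares the two internal edges between samples, so its ratio is lower than that of the separate arrangement, in which every edge is unique to one sample.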
Figure 3.
Part of a minimal mass movement to calculate the earth-mover's distance between two probability distributions on a phylogenetic tree. For this, each probability distribution is considered as a configuration of dirt piles (round bumps in the figure) on the tree, and the distance between two such dirt pile configurations is defined to be the minimum amount of physical “work” required to move the dirt in one configuration to the other.
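On a tree, the minimum work has a simple closed form, because all mass crossing an edge must travel the full length of that edge: the distance is the sum over edges of branch length times the absolute difference between the mass each distribution places below that edge. The Python sketch below applies that formula under the simplifying assumption that all mass sits at nodes rather than along edges. The toy tree, the function name, and the masses are hypothetical.

```python
# Hypothetical toy tree: child -> (parent, branch length); "R" is the root.
TREE = {"t1": ("A", 1.0), "t2": ("A", 2.0), "A": ("R", 0.5), "t3": ("R", 3.0)}


def earth_movers_distance(tree, p, q):
    """Earth-mover's distance between two node-mass distributions on a tree.

    p and q map node names to probability mass; each should sum to 1.
    """
    # Net mass (p minus q) lying below each edge, keyed by the edge's child node.
    net_below = {node: 0.0 for node in tree}
    for distribution, sign in ((p, 1.0), (q, -1.0)):
        for node, mass in distribution.items():
            while node in tree:  # walk up to the root
                net_below[node] += sign * mass
                node = tree[node][0]
    # Every unit of net mass below an edge has to cross that edge once.
    return sum(length * abs(net_below[node])
               for node, (_, length) in tree.items())


p = {"t1": 0.5, "t3": 0.5}
q = {"t2": 1.0}
distance = earth_movers_distance(TREE, p, q)
```

As a sanity check on the example, moving half the mass from `t1` to `t2` covers a path of length 3.0, and moving the other half from `t3` to `t2` covers a path of length 5.5, which matches the per-edge sum.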
