Syst Biol. 2017 Jan 1;66(1):e83-e94.
doi: 10.1093/sysbio/syw025.

Review Paper: The Shape of Phylogenetic Treespace

Free PMC article

Katherine St. John. Syst Biol.


Trees are a canonical structure for representing evolutionary histories. Many popular criteria used to infer optimal trees are computationally hard, and the number of possible tree shapes grows super-exponentially in the number of taxa. The underlying structure of the spaces of trees yields rich insights that can improve the search for optimal trees, both in accuracy and in running time, and the analysis and visualization of results. We review the past work on analyzing and comparing trees by their shape as well as recent work that incorporates trees with weighted branch lengths.
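
To give a sense of the super-exponential growth mentioned above, the following minimal sketch counts leaf-labeled binary tree topologies using the standard double-factorial formulas: (2n-5)!! unrooted and (2n-3)!! rooted binary trees on n labeled leaves. The function names are illustrative and do not come from the paper.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 or 2 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_binary_trees(n_leaves: int) -> int:
    """Number of unrooted binary topologies on n labeled leaves: (2n-5)!!."""
    if n_leaves < 3:
        raise ValueError("need at least 3 leaves for an unrooted binary tree")
    return double_factorial(2 * n_leaves - 5)


def num_rooted_binary_trees(n_leaves: int) -> int:
    """Number of rooted binary topologies on n labeled leaves: (2n-3)!!."""
    if n_leaves < 2:
        raise ValueError("need at least 2 leaves for a rooted binary tree")
    return double_factorial(2 * n_leaves - 3)


if __name__ == "__main__":
    for n in (5, 7, 10, 20):
        print(n, num_unrooted_binary_trees(n))
```

Under these formulas the 5-leaf treespace has 15 unrooted binary trees and the 7-leaf treespace has 945.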

Keywords: Maximum likelihood; maximum parsimony; tree metrics; treespace.


Figure 1.
a) An analogy for organizing points under different metrics: the points reachable by walking for 10 minutes (dark shaded regions) versus the points reachable by walking or transit in 10 minutes (light shaded regions). Image generated with Isoscope (Gortana et al. 2014). b) Similarly, the NNI (dark shaded regions) and SPR (light shaded regions) neighborhoods of the same point in the 7-leaf treespace.
Figure 2.
Tree rearrangements: a) the starting tree, b) a Nearest Neighbor Interchange (NNI) move: interchanging neighboring subtrees yields a tree one NNI move away, c) a Subtree Prune and Regraft (SPR) move: the subtree (A,B) is pruned from the initial tree and reattached, and d) a Tree Bisection and Reconnection (TBR) move: the edge separating ABC from DEFG is bisected and the two parts are reconnected by a new edge.
Figure 3.
a) A tree on 5 leaves. Each edge induces a bipartition, or split, on the leaves; for example, the internal edges induce the splits 12|345 and 123|45. b) The same tree with branch lengths. In the orthant, the horizontal axis corresponds to the weight of 12|345, and the vertical axis to the weight of 123|45.
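
The correspondence between internal edges and splits described in this caption can be sketched in a few lines. The sketch assumes a hypothetical nested-tuple encoding of the tree (for example `((1, 2), 3, (4, 5))`); the encoding and the function names are illustrative choices, not part of the paper.

```python
def leaf_set(node):
    """Return the set of leaf labels below a node of a nested-tuple tree."""
    if not isinstance(node, tuple):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def internal_splits(tree):
    """Collect the non-trivial splits (both sides have >= 2 leaves)."""
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if not isinstance(node, tuple):
            return  # leaves only induce trivial splits
        clade = leaf_set(node)
        rest = all_leaves - clade
        if len(clade) >= 2 and len(rest) >= 2:
            # Store the split as an unordered pair of leaf sets.
            splits.add(frozenset([clade, rest]))
        for child in node:
            visit(child)

    visit(tree)
    return splits


def format_split(split):
    """Render a split as text, with the side holding the smallest leaf first."""
    sides = sorted(split, key=min)
    return "|".join("".join(str(x) for x in sorted(side)) for side in sides)


if __name__ == "__main__":
    tree = ((1, 2), 3, (4, 5))
    print(sorted(format_split(s) for s in internal_splits(tree)))
```

For the tree `((1, 2), 3, (4, 5))`, the two internal edges give the splits 12|345 and 123|45 used in the caption.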
Figure 4.
a) Representing trees T0, T1, and T2 as vectors of splits. b) The Robinson–Foulds (Manhattan or L1) distance is the sum of the pairwise differences, which is 2 for all three pairs of these trees. The Branch Score Distance (Euclidean or L2) is √2 for all three pairs of these trees. The BHV metric seeks the shortest path inside the space. For the pairs T0 and T1, and T0 and T2, the BHV distance matches the Euclidean distance of √2. For the trees T1 and T2, which lie in different orthants, the BHV distance is 2.
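
A minimal sketch of the two vector distances in this caption, assuming each tree is stored as a dictionary from split labels to branch weights (absent splits have weight 0). The three example trees are hypothetical single-split trees with unit weights, chosen only so that every pair differs in exactly two coordinates; they are not necessarily the trees drawn in the figure. For such vectors the L1 distance is 2 and the L2 distance is √2. The BHV geodesic is not computed here, since it depends on which orthants are adjacent and needs a dedicated algorithm.

```python
import math


def split_vector_distance(tree_a, tree_b, p=1):
    """L_p distance between two trees given as {split: weight} dicts.

    p=1 gives the (weighted) Robinson-Foulds distance and p=2 the
    branch score distance; splits missing from a tree count as weight 0.
    """
    all_splits = set(tree_a) | set(tree_b)
    diffs = [abs(tree_a.get(s, 0.0) - tree_b.get(s, 0.0)) for s in all_splits]
    return sum(d ** p for d in diffs) ** (1.0 / p)


# Hypothetical example trees (assumption: one unit-weight internal edge each).
T0 = {"123|45": 1.0}
T1 = {"12|345": 1.0}
T2 = {"13|245": 1.0}

if __name__ == "__main__":
    for name, (a, b) in {"T0,T1": (T0, T1), "T0,T2": (T0, T2), "T1,T2": (T1, T2)}.items():
        print(name, split_vector_distance(a, b, p=1), split_vector_distance(a, b, p=2))
```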
Figure 5.
a) The NNI treespace of 5-leaf trees. Nodes are labeled using extended split notation: “12|3|45” refers to the tree with splits “12|345” and “123|45”. The highlighted circle corresponds to the orthants illustrated in b), the BHV space for unrooted 5-leaf trees. The shortest path (geodesic) between trees depends both on the tree shape and the branch lengths. The dashed lines show geodesics that visit auxiliary orthants, whereas the dotted path passes through the origin.
Figure 6.
The shaded region contains all trees with the splits 12|345 and 123|45 that are within distance 1 of the star tree (origin) under a) L1 (Robinson–Foulds), b) L2 (Branch Score Distance), and c) L∞ (maximum branch) distance.
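
The three regions in this caption can be checked numerically. The sketch below assumes a tree in this orthant is given by its two internal branch lengths (the weights of 12|345 and 123|45) and tests whether it lies within a given radius of the star tree under the L1, L2, or maximum-branch norm. The function names are illustrative.

```python
import math


def branch_norm(weights, p):
    """L_p norm of a sequence of branch lengths; p may be math.inf."""
    magnitudes = [abs(w) for w in weights]
    if p == math.inf:
        return max(magnitudes, default=0.0)
    return sum(m ** p for m in magnitudes) ** (1.0 / p)


def within_distance_of_star(weights, p, radius=1.0):
    """True if the tree lies within `radius` of the star tree (origin)."""
    if any(w < 0 for w in weights):
        raise ValueError("branch lengths must be non-negative")
    return branch_norm(weights, p) <= radius


if __name__ == "__main__":
    tree = (0.6, 0.6)  # weights of 12|345 and 123|45
    for p in (1, 2, math.inf):
        print(p, within_distance_of_star(tree, p))
```

A tree with both internal branches of length 0.6 falls outside the L1 region but inside the L2 and maximum-branch regions, which reflects how the three shaded regions nest.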
Figure 7.
a) Three 5-leaf trees that differ by single NNI moves (arrows). b) The same tree shapes represented in the continuous treespace. Each orthant contains all trees with the same underlying topology.
Figure 8.
a) The three possible rooted triples on leaves {1,2,3} and b) the three possible quartets on {1,2,3,4}.
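
The enumeration in this caption is small enough to write out directly. In this sketch a rooted triple is encoded as a (cherry, outgroup) pair and a quartet as a pair of leaf pairs; both encodings and the function names are assumptions made for illustration.

```python
def rooted_triples(leaves):
    """The three rooted triples on exactly three leaves, as (cherry, outgroup)."""
    a, b, c = sorted(leaves)
    return [((a, b), c), ((a, c), b), ((b, c), a)]


def quartets(leaves):
    """The three resolved quartets on exactly four leaves, as pairs of pairs."""
    a, b, c, d = sorted(leaves)
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


if __name__ == "__main__":
    for cherry, outgroup in rooted_triples({1, 2, 3}):
        print(f"{cherry[0]}{cherry[1]}|{outgroup}")
    for left, right in quartets({1, 2, 3, 4}):
        print(f"{left[0]}{left[1]}|{right[0]}{right[1]}")
```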



