Distribution and asymptotic behavior of the phylogenetic transfer distance

J Math Biol. 2019 Jul;79(2):485-508. doi: 10.1007/s00285-019-01365-0. Epub 2019 Apr 29.

Abstract

The transfer distance (TD) was introduced in the classification framework and studied in the context of phylogenetic tree matching. Recently, Lemoine et al. (Nature 556(7702):452-456, 2018. https://doi.org/10.1038/s41586-018-0043-0) showed that TD can be a powerful tool to assess branch support on large phylogenies, thus providing a relevant alternative to Felsenstein's bootstrap. This distance allows a reference branch [Formula: see text] in a reference tree [Formula: see text] to be compared to a branch b from another tree T (typically a bootstrap tree), both on the same set of n taxa. The TD between these branches is the number of taxa that must be transferred from one side of b to the other in order to obtain [Formula: see text]. By taking the minimum TD from [Formula: see text] to all branches in T we define the transfer index, denoted by [Formula: see text], measuring the degree of agreement of T with [Formula: see text]. Let us consider a reference branch [Formula: see text] having p tips on its light side and define the transfer support (TS) as [Formula: see text]. Lemoine et al. (2018) used computer simulations to show that the TS defined in this manner is close to 0 for random "bootstrap" trees. In this paper, we demonstrate that result mathematically: when T is randomly drawn, TS converges in probability to 0 as n tends to infinity. Moreover, we fully characterize the distribution of [Formula: see text] on caterpillar trees, indicating that the convergence is fast, and that even when n is small, moderate levels of branch support cannot appear by chance.
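
The definitions above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each branch is represented by the set of taxa on one of its sides, so that the TD is the size of the symmetric difference between the two sides, or its complement in n, whichever is smaller. The function names, the toy tree, and the TS normalization 1 - index/(p - 1) (taken from Lemoine et al. 2018, since the formula is not reproduced in this record) are assumptions made for illustration.

```python
from typing import FrozenSet, Iterable

Side = FrozenSet[str]  # one side of a branch (bipartition), as a set of taxa


def transfer_distance(ref_side: Side, other_side: Side, taxa: Side) -> int:
    """Number of taxa to move across `other` so that it matches `ref`."""
    moved = len(ref_side ^ other_side)  # taxa on which the two sides disagree
    # A bipartition is unordered, so also compare against the complement side.
    return min(moved, len(taxa) - moved)


def transfer_index(ref_side: Side, branches: Iterable[Side], taxa: Side) -> int:
    """Minimum transfer distance from the reference branch to any branch of T."""
    return min(transfer_distance(ref_side, side, taxa) for side in branches)


def transfer_support(ref_side: Side, branches: Iterable[Side], taxa: Side) -> float:
    """Assumed normalization (Lemoine et al. 2018): 1 - index / (p - 1)."""
    p = min(len(ref_side), len(taxa) - len(ref_side))  # size of the light side
    if p < 2:
        raise ValueError("transfer support needs a light side with p >= 2 tips")
    return 1 - transfer_index(ref_side, branches, taxa) / (p - 1)


# Hypothetical toy data: six taxa, reference branch {a,b,c} | {d,e,f}.
taxa = frozenset("abcdef")
ref = frozenset("abc")
# Internal branches of a made-up tree T = (((a,b),d),c,(e,f)).
tree_branches = [frozenset("ab"), frozenset("abd"), frozenset("ef")]

index = transfer_index(ref, tree_branches, taxa)
support = transfer_support(ref, tree_branches, taxa)
```

In this toy case the reference branch is absent from T, yet moving the single taxon c across the branch {a,b} recovers it, so the support is intermediate rather than zero as it would be under Felsenstein's bootstrap.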

Keywords: Concentration inequalities; Distances between bipartitions and phylogenies; Lattice paths; Phylogenetic trees; R-distance; Random phylogenies.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Gene Transfer, Horizontal*
  • Models, Genetic*
  • Phylogeny*