Counting and optimising maximum phylogenetic diversity sets

J Math Biol. 2022 Jul 16;85(1):11. doi: 10.1007/s00285-022-01779-3.


In conservation biology, phylogenetic diversity (PD) provides a way to quantify the impact of the current rapid extinction of species on the evolutionary 'Tree of Life'. This approach recognises that extinction removes not only species but also the branches of the tree on which unique features shared by the extinct species arose. In this paper, we investigate three questions that are relevant to PD. The first asks how many sets of species of a given size k preserve the maximum possible amount of PD in a given tree. The number of such maximum PD sets can be very large, even for moderate-sized phylogenies. We provide a combinatorial characterisation of maximum PD sets, focusing on the setting where the branch lengths are ultrametric (e.g. proportional to time). This leads to a polynomial-time algorithm for calculating the number of maximum PD sets of size k by applying a generating function; we also investigate the types of tree shapes that harbour the most (or fewest) maximum PD sets of size k. Our second question concerns optimising a linear function on the species (regarded as leaves of the phylogenetic tree) across all the maximum PD sets of a given size. Using the characterisation result from the first question, we show how this optimisation problem can be solved in polynomial time, even though the number of maximum PD sets can grow exponentially. Our third question considers a dual problem: if k species were to become extinct, then what is the largest possible loss of PD in the resulting tree? For this question, we describe a polynomial-time solution based on dynamic programming.
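To make the three questions concrete, the following is a minimal brute-force sketch on a toy tree. It is not the polynomial-time method of the paper: it simply enumerates every k-subset of leaves, which is only feasible for very small trees. The tree, its branch lengths, the species weights and all function names are hypothetical, and PD is assumed here to be the rooted version (the total length of the branches on the paths from the root to the chosen leaves).

```python
from itertools import combinations

# Hypothetical toy tree, stored as child -> (parent, branch length).
# It is ultrametric: every leaf lies at distance 3 from the root.
TREE = {
    "u": ("root", 1.0),
    "v": ("root", 2.0),
    "a": ("u", 2.0),
    "b": ("u", 2.0),
    "c": ("v", 1.0),
    "d": ("v", 1.0),
}
LEAVES = ["a", "b", "c", "d"]


def pd(species):
    """Rooted PD (an assumption): total length of the branches that lie
    on a path from the root to at least one of the given leaves."""
    edges = set()
    for leaf in species:
        node = leaf
        while node in TREE:          # walk up until the root is reached
            edges.add(node)          # an edge is named by its child node
            node = TREE[node][0]
    return sum(TREE[e][1] for e in edges)


def max_pd_sets(k):
    """Question 1, by exhaustive search: the maximum PD over all
    k-subsets of leaves, and every subset attaining it."""
    scored = [(pd(s), s) for s in combinations(LEAVES, k)]
    best = max(score for score, _ in scored)
    return best, [s for score, s in scored if abs(score - best) < 1e-9]


def best_weighted_max_pd_set(k, weights):
    """Question 2, by exhaustive search: among the maximum PD sets of
    size k, one that maximises a linear (additive) species score."""
    _, sets = max_pd_sets(k)
    return max(sets, key=lambda s: sum(weights[x] for x in s))


def max_pd_loss(k):
    """Question 3, by exhaustive search: the largest drop in PD when
    exactly k leaves go extinct."""
    total = pd(LEAVES)
    return max(total - pd(set(LEAVES) - set(gone))
               for gone in combinations(LEAVES, k))


if __name__ == "__main__":
    best, sets = max_pd_sets(2)
    print(best, len(sets))                      # max PD and number of optimal sets
    weights = {"a": 1, "b": 2, "c": 3, "d": 4}  # hypothetical species scores
    print(best_weighted_max_pd_set(2, weights))
    print(max_pd_loss(2))
```

On this toy tree, four of the six pairs of species attain the maximum PD, which illustrates on a small scale why the number of maximum PD sets can become large. The exhaustive search examines all k-subsets, so its running time grows exponentially with the number of leaves; this is the cost that the characterisation, generating function and dynamic programming described above are designed to avoid.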

Keywords: Algorithms; Biodiversity measures; Enumeration; Optimisation; Phylogenetic diversity; Phylogenetic tree.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biodiversity*
  • Biological Evolution*
  • Phylogeny