Impossibility of Consistent Distance Estimation from Sequence Lengths Under the TKF91 Model

Bull Math Biol. 2020 Sep 13;82(9):123. doi: 10.1007/s11538-020-00801-3.

Abstract

We consider the problem of distance estimation under the TKF91 model of sequence evolution by insertions, deletions and substitutions on a phylogeny. In an asymptotic regime where the expected sequence lengths tend to infinity, we show that no consistent distance estimation is possible from sequence lengths alone. More formally, we establish that the distributions of pairs of sequence lengths at different distances cannot be distinguished with probability going to one.
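For readers who want a concrete picture of the "sequence lengths" in question, the following is a minimal simulation sketch and is not taken from the paper. It assumes the standard TKF91 length dynamics: with insertion rate lam per link and deletion rate mu per base (lam < mu), a sequence of n bases gains a base at rate lam * (n + 1) and loses one at rate mu * n, and the stationary length is geometric with mean lam / (mu - lam). The function names, the parameter values, and the two-leaf setup (a stationary root with two independent branches of length t) are illustrative assumptions.

```python
import random

def simulate_length(n0, lam, mu, t, rng):
    """Gillespie simulation of the sequence length over time t.

    Assumed dynamics: birth rate lam * (n + 1) (the immortal link plus
    n mortal links can each insert), death rate mu * n.
    """
    n, clock = n0, 0.0
    while True:
        birth = lam * (n + 1)
        death = mu * n
        total = birth + death          # always > 0 because lam > 0
        clock += rng.expovariate(total)
        if clock >= t:
            return n                   # next event falls after time t
        if rng.random() * total < birth:
            n += 1
        else:
            n -= 1

def sample_stationary_length(lam, mu, rng):
    """Draw from the geometric law P(n) = (1 - lam/mu) * (lam/mu)**n, n >= 0."""
    q = lam / mu
    n = 0
    while rng.random() < q:
        n += 1
    return n

def sample_length_pair(lam, mu, t, rng):
    """Lengths at two leaves, each at time t from a stationary root (assumed setup)."""
    root = sample_stationary_length(lam, mu, rng)
    left = simulate_length(root, lam, mu, t, rng)
    right = simulate_length(root, lam, mu, t, rng)
    return left, right

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters; lam close to mu gives long expected lengths.
    for t in (0.1, 0.5):
        pairs = [sample_length_pair(0.99, 1.0, t, rng) for _ in range(5)]
        print(t, pairs)
```

Taking lam close to mu makes the expected length lam / (mu - lam) large, which is one way to mimic the asymptotic regime described above. The sketch only generates length pairs and does not attempt to reproduce the paper's argument.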

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Sequence
  • Evolution, Molecular*
  • Mathematical Concepts
  • Models, Genetic*
  • Phylogeny
  • Probability