A 4-lineage Statistical Suite to Evaluate the Support of Large-Scale Retrotransposon Insertion Data to Reconstruct Evolutionary Trees

Syst Biol. 2023 Jun 17;72(3):649-661. doi: 10.1093/sysbio/syac082.

Abstract

Retrophylogenomics makes use of genome-wide retrotransposon presence/absence insertion patterns to resolve questions in phylogeny and population genetics. In the genomics era, evaluating high-throughput data requires the associated development of appropriately powerful statistical tools. The currently used KKSC 3-lineage statistical test for estimating the significance of retrophylogenomic data is limited by the number of possible tree topologies it can assess in one step. To improve on this, we have extended the analysis to simultaneously compare four lineages, enabling us to evaluate ten distinct presence/absence insertion patterns for 26 possible tree topologies plus 129 trees with different incidences of hybridization or introgression. The new tool provides statistics for cases involving multiple ancestral hybridizations/introgressions, ancestral incomplete lineage sorting, bifurcation, and polytomy. The test is embedded in a user-friendly R web application (http://retrogenomics.uni-muenster.de:3838/hammlet/) and is available for use by the scientific community. [ancestral hybridization/introgression; ancestral incomplete lineage sorting (ILS); empirical distribution; KKSC-statistics; 4-lineage (4-LIN) insertion polymorphism; polytomy; retrophylogenomics.]
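
The counts quoted above can be reproduced with a short enumeration. The sketch below is illustrative only and is not the authors' implementation. It assumes that the ten patterns are the insertions shared by exactly two or exactly three of the four lineages, and that the 26 topologies are all rooted trees on four labelled lineages with multifurcations (polytomies) allowed. Neither assumption is stated in the abstract. The lineage labels A to D and all function names are hypothetical.

```python
from itertools import combinations, product

LINEAGES = ("A", "B", "C", "D")  # hypothetical lineage labels


def informative_patterns(lineages):
    """Presence sets shared by at least two, but not all, lineages (assumed definition)."""
    n = len(lineages)
    return [subset for k in range(2, n) for subset in combinations(lineages, k)]


def set_partitions(items):
    """Yield every partition of a list into non-empty blocks, each exactly once."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or it joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def rooted_trees(leaves, binary_only=False):
    """Yield rooted trees on labelled leaves as Newick-like strings.

    Each internal node splits its leaf set into two or more blocks;
    with binary_only=True only two-way splits are allowed.
    """
    if len(leaves) == 1:
        yield leaves[0]
        return
    for partition in set_partitions(list(leaves)):
        if len(partition) < 2:
            continue  # a node must have at least two children
        if binary_only and len(partition) != 2:
            continue
        subtrees = [list(rooted_trees(block, binary_only)) for block in partition]
        for combo in product(*subtrees):
            yield "(" + ",".join(sorted(combo)) + ")"


patterns = informative_patterns(LINEAGES)
all_trees = list(rooted_trees(LINEAGES))
binary_trees = list(rooted_trees(LINEAGES, binary_only=True))

print(len(patterns), "patterns:", ["".join(p) for p in patterns])
print(len(all_trees), "rooted topologies,", len(binary_trees), "of them fully bifurcating")
```

Under these assumptions the script reports 10 patterns and 26 rooted topologies, 15 of which are fully bifurcating and 11 of which contain at least one polytomy. The 129 hybridization/introgression scenarios are not enumerated here, because the abstract does not describe how they are constructed.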

MeSH terms

  • Biological Evolution*
  • Genomics
  • Phylogeny
  • Retroelements* / genetics
  • Software

Substances

  • Retroelements