The perils of intralocus recombination for inferences of molecular convergence

Philos Trans R Soc Lond B Biol Sci. 2019 Jul 22;374(1777):20180244. doi: 10.1098/rstb.2018.0244. Epub 2019 Jun 3.

Abstract

Accurate inferences of convergence require that the appropriate tree topology be used. If there is a mismatch between the tree a trait has evolved along and the tree used for analysis, then false inferences of convergence ('hemiplasy') can occur. To avoid problems of hemiplasy when there are high levels of gene tree discordance with the species tree, researchers have begun to construct tree topologies from individual loci. However, due to intralocus recombination, even locus-specific trees may contain multiple topologies within them. This implies that the use of individual tree topologies discordant with the species tree can still lead to incorrect inferences about molecular convergence. Here, we examine the frequency with which single exons and single protein-coding genes contain multiple underlying tree topologies, in primates and Drosophila, and quantify the effects of hemiplasy when using trees inferred from individual loci. In both clades, we find that there are most often multiple diagnosable topologies within single exons and whole genes, with 91% of Drosophila protein-coding genes containing multiple topologies. Because of this underlying topological heterogeneity, even using trees inferred from individual protein-coding genes results in 25% and 38% of substitutions falsely labelled as convergent in primates and Drosophila, respectively. While constructing local trees can reduce the problem of hemiplasy, our results suggest that it will be difficult to completely avoid false inferences of convergence. We conclude by suggesting several ways forward in the analysis of convergent evolution, for both molecular and morphological characters. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
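The mechanism described in the abstract, a single substitution appearing as two changes when it is mapped onto the wrong tree, can be illustrated with a small sketch. This is not the authors' pipeline. The taxon names (A, B, C, O), the site pattern, the two topologies, and the use of Fitch parsimony to count changes are all assumptions chosen for illustration.

```python
# Illustrative sketch only: hypothetical taxa and site pattern, not data from the paper.
# Trees are nested tuples; leaves are taxon names (strings).

def fitch(tree, states):
    """Return (possible_state_set, minimum_changes) for a rooted binary tree
    using Fitch parsimony."""
    if isinstance(tree, str):          # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:                         # children agree: no extra change needed
        return shared, left_changes + right_changes
    # children disagree: take the union and count one change
    return left_set | right_set, left_changes + right_changes + 1

# Hypothetical site: derived state 1 is shared by B and C; O is the outgroup.
site = {"A": 0, "B": 1, "C": 1, "O": 0}

# Assumed local (gene) tree at this site groups B with C.
gene_tree = ((("B", "C"), "A"), "O")
# Assumed tree used for analysis groups A with B instead.
analysis_tree = ((("A", "B"), "C"), "O")

_, gene_changes = fitch(gene_tree, site)
_, analysis_changes = fitch(analysis_tree, site)

print("changes on local tree:   ", gene_changes)
print("changes on analysis tree:", analysis_changes)
```

On the local tree the pattern needs one change, but on the mismatched analysis tree it needs two, so a single substitution would be read as a convergent or reversed change. The same mismatch can arise within a single locus if intralocus recombination gives different sites different underlying topologies.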

Keywords: gene tree discordance; hemiplasy; homoplasy; incomplete lineage sorting; molecular convergence.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Drosophila / classification
  • Drosophila / genetics*
  • Evolution, Molecular*
  • Exons
  • Phylogeny
  • Primates / classification
  • Primates / genetics*
  • Proteins / genetics
  • Recombination, Genetic*

Substances

  • Proteins

Associated data

  • figshare/10.6084/m9.figshare.c.4511594