Review
Philos Trans R Soc Lond B Biol Sci. 2006 Jun 29;361(1470):1039-54.
doi: 10.1098/rstb.2006.1845.

The Origin and Diversification of Eukaryotes: Problems With Molecular Phylogenetics and Molecular Clock Estimation

Free PMC article

Andrew J Roger et al. Philos Trans R Soc Lond B Biol Sci.

Abstract

Determining the relationships among and divergence times for the major eukaryotic lineages remains one of the most important and controversial outstanding problems in evolutionary biology. The sequencing and phylogenetic analyses of ribosomal RNA (rRNA) genes led to the first nearly comprehensive phylogenies of eukaryotes in the late 1980s, and supported a view in which cellular complexity was acquired during the divergence of extant unicellular eukaryote lineages. More recently, however, refinements in analytical methods coupled with the availability of many additional genes for phylogenetic analysis showed that much of the deep structure of early rRNA trees was artefactual. Recent phylogenetic analyses of multiple genes and the discovery of important molecular and ultrastructural phylogenetic characters have resolved eukaryotic diversity into six major hypothetical groups. Yet relationships among these groups remain poorly understood because of saturation of sequence changes on the billion-year time-scale, possible rapid radiations of major lineages, phylogenetic artefacts and endosymbiotic or lateral gene transfer among eukaryotes. Estimating the divergence dates between the major eukaryote lineages using molecular analyses is even more difficult than phylogenetic estimation. Error in such analyses comes from a myriad of sources, including (i) calibration fossil dates, (ii) the assumed phylogenetic tree, (iii) the nucleotide or amino acid substitution model, (iv) substitution number (branch length) estimates, (v) the model of how rates of evolution change over the tree, (vi) error inherent in the time estimates for a given model and (vii) how multiple gene data are treated. By reanalysing datasets from recently published molecular clock studies, we show that when errors from these various sources are properly accounted for, the confidence intervals on inferred dates can be very large. Furthermore, estimated dates of divergence vary hugely depending on the methods used and their assumptions. Accurate dating of divergence times among the major eukaryote lineages will require a robust tree of eukaryotes, a much richer Proterozoic fossil record of microbial eukaryotes assignable to extant groups for calibration, more sophisticated relaxed molecular clock methods and many more genes sampled from the full diversity of microbial eukaryotes.

Figures

Figure 1
Alternative views of the tree of eukaryotes. (a) The topology typically recovered in rRNA phylogenies in the 1990s (Sogin 1991; Cavalier-Smith & Chao 1996). Multifurcations indicate poorly supported branches or different branching orders depending on the taxonomic sampling. The grey-shaded region of the tree indicates the part of the rRNA tree that is likely artefactual, resulting from long-branch attraction (LBA). Note that the late-branching position of the Foraminifera is shown as recovered in later rRNA analyses (Nikolaev et al. 2004). (b) A hypothetical phylogeny indicating the six major supergroups of eukaryotes (see Simpson & Roger (2004) and Keeling et al. (2005) for recent reviews). Dotted branches indicate lineages that do not clearly fall within any of the major groups. The placement of the root of the tree of eukaryotes is indicated by dihydrofolate reductase (DHFR)–thymidylate synthase (TS) fusion data (Stechmann & Cavalier-Smith 2002) and myosin gene family data (Richards & Cavalier-Smith 2005). Alternative positions for the root (Arisue et al. 2005) are indicated by asterisks. The grey shaded region depicts the parts of this hypothetical tree of eukaryotes that are not strongly recovered (with greater than 85% bootstrap support) in published single or multiple gene phylogenies (e.g. Hampl et al. 2005; Simpson et al. 2006).
Figure 2
Changing among-site rate variation (ASRV) distributions in EF-1α homologues cause Microsporidia to artefactually branch at the base of eukaryotes (Inagaki et al. 2004). The ASRV distribution (indicated by shaded boxes) of microsporidian sequences is more similar to the archaebacterial sequences, possibly because of parallel loss of constraints at sites that are functionally conserved in other eukaryotes. Under these conditions, phylogenetic methods that assume equal rates at sites or a simple ASRV distribution artefactually recover the Microsporidia as branching basally to other eukaryotes, grouping with the archaebacterial outgroup.
Figure 3
Assumed topologies for molecular clock studies. (a) Topology used by Peterson–Butterfield (PB) in their analyses (Peterson & Butterfield 2005). (b) Topology from the Douzery (DZ) dataset (Douzery et al. 2004). The nodes under examination in the current study are labelled by the large numbers. Boxed numbers mark constrained nodes and give their fossil dates (in millions of years), taken from the original studies.
Figure 4
(a) Variation in age estimates and confidence intervals for the PB dataset under differing models of substitution. The trees and branch lengths were optimized by maximum likelihood (ML) using Tree-Puzzle 5.2 (Schmidt et al. 2002) under the VT model (Müller et al. 2002), assuming either equal rates or a gamma distribution for ASRV (VT+Γ), or in PAUP* (Swofford 2000) using uncorrected distances and minimum evolution (ME). Age estimates and confidence intervals were generated using r8s (Sanderson 2003) under penalized likelihood (PL) with a logarithmic penalty and cross-validation optimization of the penalty coefficient. (b) Variation in age estimates under different molecular clock methods for the PB dataset. (c) Variation in age estimates under different molecular clock methods for the DZ dataset. Age estimates were generated for the Langley-Fitch (LF), non-parametric rate smoothing (NPRS) and PL models in r8s using a tree with ML branch lengths, under the VT+Γ model for the PB dataset and the Whelan and Goldman plus gamma (WAG+Γ) model (Whelan & Goldman 2001) for the DZ dataset. Bayesian estimates were generated using Estbranches and Multidivtime5b (Kishino et al. 2001). (d) The effect of different schemes for constraining fossil dates on age estimates and confidence intervals. The branch lengths used were generated by ML with the VT+Γ model. Ages were generated in r8s employing either the NPRS or PL methods with a logarithmic penalty. Constraint models were either (i) all nodes fixed to the corresponding fossil date (‘all fixed’), (ii) nodes set with fossil dates as a minimum age and 1500 Myr as a maximum (‘upper limit’) or (iii) nodes set with their fossil dates as a minimum age and the fossil date of the parent node as a maximum (‘nearest-neighbour’). Cross-validation optimization of the PL penalty coefficient was not employed for analyses shown in (b), (c) and (d).
Figure 5
(a) Effect of bootstrapping on confidence intervals under penalized likelihood with a logarithmic penalty with cross-validation optimization of the penalty coefficient. 100 bootstraps of the PB dataset ML tree were generated using Puzzleboot (http://www.tree-puzzle.de) and Tree-Puzzle 5.2. In r8s, confidence intervals were generated for the single tree and the 100 bootstraps. Standard deviations from the bootstrapped trees were also obtained for the nodes of interest. (b) Effect of different priors under Bayesian analysis with Multidivtime5b. Two different prior distributions centred around two different root-to-tip age estimates were used and the posterior mean age estimates for nodes and their 95% credible intervals are shown. (c,d) Age estimates for datasets treated as a single large concatenate of genes or as ‘separate’ loci (Thorne & Kishino 2002). Estimates and 95% credible intervals for the PB dataset (c) and the DZ dataset (d) under these conditions are shown.
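The bootstrap summary described in panel (a) of Figure 5 can be illustrated with a minimal Python sketch. It assumes that one age estimate for a node of interest has already been obtained from each bootstrap replicate tree, and it reports the mean, the standard deviation and a simple percentile interval. The function names and the five replicate ages are hypothetical placeholders, not values from the study, and the percentile interval is a generic stand-in rather than the procedure r8s or Multidivtime5b uses internally.

```python
import statistics


def percentile_interval(ages, lower=2.5, upper=97.5):
    """Return (low, high) percentile bounds using linear interpolation."""
    ordered = sorted(ages)
    n = len(ordered)

    def pct(p):
        # Fractional rank of the p-th percentile within the sorted sample.
        rank = (p / 100) * (n - 1)
        lo = int(rank)
        hi = min(lo + 1, n - 1)
        frac = rank - lo
        return ordered[lo] + frac * (ordered[hi] - ordered[lo])

    return pct(lower), pct(upper)


def summarize_bootstrap_ages(ages):
    """Summarize node ages (in Myr) estimated from bootstrap replicate trees."""
    low, high = percentile_interval(ages)
    return {
        "mean": statistics.mean(ages),
        "sd": statistics.stdev(ages),  # sample standard deviation
        "ci95": (low, high),
    }


# Hypothetical ages (Myr) for one node, one value per bootstrap replicate.
# These numbers are illustrative only and do not come from the paper.
replicate_ages = [900.0, 950.0, 1000.0, 1050.0, 1100.0]
summary = summarize_bootstrap_ages(replicate_ages)
print(summary)
```

With 100 replicates, as in the caption, the same function would be called on a list of 100 ages. The spread across replicates reflects uncertainty in branch lengths that a confidence interval computed on a single tree does not capture.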

Cited by 39 articles

  • Symbiosis in eukaryotic evolution.
    López-García P, Eme L, Moreira D. J Theor Biol. 2017 Dec 7;434:20-33. doi: 10.1016/j.jtbi.2017.02.031. Epub 2017 Feb 28. PMID: 28254477 Free PMC article.
  • The Astrobiology Primer v2.0.
    Domagal-Goldman SD, Wright KE, Adamala K, Arina de la Rubia L, Bond J, Dartnell LR, Goldman AD, Lynch K, Naud ME, Paulino-Lima IG, Singer K, Walther-Antonio M, Abrevaya XC, Anderson R, Arney G, Atri D, Azúa-Bustos A, Bowman JS, Brazelton WJ, Brennecka GA, Carns R, Chopra A, Colangelo-Lillis J, Crockett CJ, DeMarines J, Frank EA, Frantz C, de la Fuente E, Galante D, Glass J, Gleeson D, Glein CR, Goldblatt C, Horak R, Horodyskyj L, Kaçar B, Kereszturi A, Knowles E, Mayeur P, McGlynn S, Miguel Y, Montgomery M, Neish C, Noack L, Rugheimer S, Stüeken EE, Tamez-Hidalgo P, Imari Walker S, Wong T. Astrobiology. 2016 Aug;16(8):561-653. doi: 10.1089/ast.2015.1460. PMID: 27532777 Free PMC article. Review.
  • A Historical Overview of the Classification, Evolution, and Dispersion of Leishmania Parasites and Sandflies.
    Akhoundi M, Kuhls K, Cannet A, Votýpka J, Marty P, Delaunay P, Sereno D. PLoS Negl Trop Dis. 2016 Mar 3;10(3):e0004349. doi: 10.1371/journal.pntd.0004349. eCollection 2016 Mar. PMID: 26937644 Free PMC article. Review.
  • On the age of eukaryotes: evaluating evidence from fossils and molecular clocks.
    Eme L, Sharpe SC, Brown MW, Roger AJ. Cold Spring Harb Perspect Biol. 2014 Aug 1;6(8):a016139. doi: 10.1101/cshperspect.a016139. PMID: 25085908 Free PMC article. Review.
  • Paleobiological perspectives on early eukaryotic evolution.
    Knoll AH. Cold Spring Harb Perspect Biol. 2014 Jan 1;6(1):a016121. doi: 10.1101/cshperspect.a016121. PMID: 24384569 Free PMC article.