BMC Evol Biol. 2009 Mar 17;9:61.
doi: 10.1186/1471-2148-9-61.

Inferring Phylogenies With Incomplete Data Sets: A 5-gene, 567-taxon Analysis of Angiosperms

Free PMC article

J Gordon Burleigh et al. BMC Evol Biol. 2009.

Abstract

Background: Phylogenetic analyses of angiosperm relationships have used only a small percentage of available sequence data, but phylogenetic data matrices often can be augmented with existing data, especially if one allows missing characters. We explore the effects on phylogenetic analyses of adding 378 matK sequences and 240 26S rDNA sequences to the complete 3-gene, 567-taxon angiosperm phylogenetic matrix of Soltis et al.

Results: We performed maximum likelihood bootstrap analyses of the complete, 3-gene 567-taxon data matrix and the incomplete, 5-gene 567-taxon data matrix. Although the 5-gene matrix has more missing data (27.5%) than the 3-gene data matrix (2.9%), the 5-gene analysis resulted in higher levels of bootstrap support. Within the 567-taxon tree, the increase in support is most evident for relationships among the 170 taxa for which both matK and 26S rDNA sequences were added, and there is little gain in support for relationships among the 119 taxa having neither matK nor 26S rDNA sequences. The 5-gene analysis also places the enigmatic Hydrostachys in Lamiales (BS = 97%) rather than in Cornales (BS = 100% in 3-gene analysis). The placement of Hydrostachys in Lamiales is unprecedented in molecular analyses, but it is consistent with embryological and morphological data.

Conclusion: Adding available, and often incomplete, sets of sequences to existing data sets can be a fast and inexpensive way to increase support for phylogenetic relationships and produce novel and credible phylogenetic hypotheses.
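
The taxon counts reported in the abstract can be cross-checked with simple inclusion-exclusion arithmetic. The sketch below is illustrative only and is not part of the authors' analysis; the function name `coverage_partition` and its parameter names are assumptions introduced here. It takes the total number of taxa, the number with matK, the number with 26S rDNA, and the number with both, and derives how many taxa carry only one of the added genes or neither.

```python
def coverage_partition(total, with_a, with_b, with_both):
    """Partition `total` taxa by presence of two optional genes.

    Hypothetical helper (not from the paper): applies inclusion-exclusion
    to the counts of taxa that have gene A, gene B, and both.
    """
    only_a = with_a - with_both            # gene A but not gene B
    only_b = with_b - with_both            # gene B but not gene A
    either = with_a + with_b - with_both   # at least one of the two genes
    neither = total - either               # only the 3 core genes
    return {
        "only_a": only_a,
        "only_b": only_b,
        "both": with_both,
        "neither": neither,
    }


# Counts taken from the abstract: 567 taxa, 378 with matK,
# 240 with 26S rDNA, 170 with both.
counts = coverage_partition(total=567, with_a=378, with_b=240, with_both=170)
print(counts)
```

With the reported figures, the derived number of taxa lacking both matK and 26S rDNA agrees with the 119 stated in the Results, and the four groups sum to the full 567 taxa.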

Figures

Figure 1
Diagram representing the distribution of data in the total 5-gene data matrix. All taxa in the matrix contain sequences from the first 3 genes (18S rDNA, atpB, and rbcL), 378 taxa have matK sequences, and 240 taxa have 26S rDNA sequences. Only 170 taxa have sequences from both matK and 26S rDNA, and 119 taxa have no matK or 26S rDNA sequences.
Figure 2
Summary of the majority rule consensus from the 3-gene (18S rDNA, atpB, and rbcL) ML analysis. Names of the orders and informal names follow APG II [4] and Soltis et al. [2,3], with Hydrostachys in Cornales. Numbers above the branches are bootstrap percentages. This tree was rooted using all gymnosperm taxa as outgroups.
Figure 3
Summary of the majority rule consensus from the 5-gene (18S rDNA, atpB, rbcL, matK, and 26S rDNA) ML analysis. Names of the orders and informal names follow APG II [4] and Soltis et al. [2,3], with Hydrostachys in Lamiales. Numbers above the branches are bootstrap percentages. This tree was rooted using all gymnosperm taxa as outgroups.
Figure 4
Detail of the position of Hydrostachys within Cornales in the majority rule consensus from the 3-gene ML analysis.
Figure 5
Detail of the position of Hydrostachys within Lamiales in the majority rule consensus from the 5-gene ML analysis.


References

    1. Soltis PS, Soltis DE, Chase MW. Angiosperm phylogeny inferred from multiple genes as a tool for comparative biology. Nature. 1999;402:402–404. doi: 10.1038/46528.
    2. Soltis DE, Soltis PS, Chase MW, Mort ME, Albach DC, Zanis M, Savolainen V, Hahn WH, Hoot SB, Fay MF, Axtell M, Swensen SM, Prince LM, Kress WJ, Nixon KC, Farris JS. Angiosperm phylogeny inferred from 18S rDNA, rbcL, and atpB sequences. Bot J Linn Soc. 2000;133:381–461.
    3. Soltis DE, Gitzendanner MA, Soltis PS. A 567-taxon data set for angiosperms: the challenges posed by Bayesian analyses of large data sets. Int J Plant Sci. 2007;168:137–157. doi: 10.1086/509788.
    4. APG II (Angiosperm Phylogeny Group II). An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants. Bot J Linn Soc. 2003;141:399–436. doi: 10.1046/j.1095-8339.2003.t01-1-00158.x.
    5. Soltis DE, Soltis PS, Endress PK, Chase MW. Phylogeny and evolution of angiosperms. Sunderland, Massachusetts: Sinauer; 2005.
