Next-generation phylogenomics

Cheong Xin Chan; Mark A Ragan

doi:10.1186/1745-6150-8-3

Biol Direct. 2013 Jan 22;8:3. doi: 10.1186/1745-6150-8-3.

Authors

Cheong Xin Chan¹, Mark A Ragan

Affiliation

¹ Institute for Molecular Bioscience, and ARC Centre of Excellence in Bioinformatics, The University of Queensland, Brisbane, QLD, 4072, Australia.

Abstract

Thanks to advances in next-generation technologies, genome sequences are now being generated at breadth (e.g. across environments) and depth (thousands of closely related strains, individuals or samples) unimaginable only a few years ago. Phylogenomics, the study of evolutionary relationships based on comparative analysis of genome-scale data, has so far been developed as industrial-scale molecular phylogenetics, proceeding in the two classical steps: multiple alignment of homologous sequences, followed by inference of a tree (or multiple trees). However, the algorithms typically employed for these steps scale poorly with the number of sequences, such that for an increasing number of problems, high-quality phylogenomic analysis is (or soon will be) computationally infeasible. Moreover, next-generation data are often incomplete and error-prone, and analysis may be further complicated by genome rearrangement, gene fusion and deletion, lateral genetic transfer, and transcript variation. Here we argue that next-generation data require next-generation phylogenomics, including so-called alignment-free approaches.
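
To make the phrase "alignment-free" concrete, the following is a minimal sketch of one common family of such approaches: comparing sequences by their k-mer (word) count profiles rather than by a multiple alignment. It is an illustration only and is not taken from the article; the function names, the choice of k = 3, the cosine distance, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kmer_profile(seq, k=3):
    """Count overlapping k-mers; windows with non-ACGT symbols are skipped."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def cosine_distance(p, q):
    """Return 1 - cosine similarity between two k-mer count profiles."""
    dot = sum(p[kmer] * q[kmer] for kmer in p.keys() & q.keys())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 1.0  # an empty profile is treated as maximally distant
    return 1.0 - dot / (norm_p * norm_q)


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict of name -> sequence, with no alignment step."""
    profiles = {name: kmer_profile(s, k) for name, s in seqs.items()}
    return {
        (a, b): cosine_distance(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    toy = {
        "s1": "ACGTACGTACGTTTGA",
        "s2": "ACGTACGAACGTTTGA",
        "s3": "GGGCCCGGGCCCAAAT",
    }
    for pair, d in distance_matrix(toy).items():
        print(pair, round(d, 3))
```

Each profile is built in a single pass over its sequence, so only the pairwise comparison grows quadratically with the number of sequences. The resulting distances could be passed to a distance-based tree method such as neighbour-joining, although whether such distances reflect evolutionary divergence well depends on the data and on the choice of k.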

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Evolution, Molecular
Genome
Genomics / methods*
Phylogeny*
Sequence Alignment
Sequence Analysis, DNA / methods*