Next-generation phylogenomics

Biol Direct. 2013 Jan 22:8:3. doi: 10.1186/1745-6150-8-3.

Abstract

Thanks to advances in next-generation technologies, genome sequences are now being generated at breadth (e.g. across environments) and depth (thousands of closely related strains, individuals or samples) unimaginable only a few years ago. Phylogenomics, the study of evolutionary relationships based on comparative analysis of genome-scale data, has so far been developed as industrial-scale molecular phylogenetics, proceeding in the two classical steps: multiple alignment of homologous sequences, followed by inference of a tree (or multiple trees). However, the algorithms typically employed for these steps scale poorly with the number of sequences, such that for an increasing number of problems, high-quality phylogenomic analysis is (or soon will be) computationally infeasible. Moreover, next-generation data are often incomplete and error-prone, and analysis may be further complicated by genome rearrangement, gene fusion and deletion, lateral genetic transfer, and transcript variation. Here we argue that next-generation data require next-generation phylogenomics, including so-called alignment-free approaches.
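The abstract does not name a specific alignment-free method. As a purely illustrative sketch, one widely used family compares sequences by their k-mer (short substring) count profiles instead of aligning them. The function names, the choice of cosine distance, and the default k = 3 below are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping k-mers in a sequence; no alignment is involved."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two k-mer count profiles."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # A sequence shorter than k has an empty profile; treat as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict of name -> sequence (illustrative helper)."""
    profiles = {name: kmer_counts(s, k) for name, s in seqs.items()}
    names = sorted(profiles)
    matrix = [[cosine_distance(profiles[a], profiles[b]) for b in names]
              for a in names]
    return names, matrix
```

Each profile is built in time linear in the sequence length, and no multiple alignment is computed. A matrix produced this way could, in principle, be passed to a distance-based tree-building method such as neighbor joining.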

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Evolution, Molecular
  • Genome
  • Genomics / methods*
  • Phylogeny*
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*