Evaluating Phylostratigraphic Evidence for Widespread De Novo Gene Birth in Genome Evolution

Mol Biol Evol. 2016 May;33(5):1245-56. doi: 10.1093/molbev/msw008. Epub 2016 Jan 11.


The source of genetic novelty is an area of wide interest and intense investigation. Although gene duplication is conventionally thought to dominate the production of new genes, this view was recently challenged by a proposal of widespread de novo gene origination in eukaryotic evolution. Specifically, distributions of various gene properties such as coding sequence length, expression level, codon usage, and probability of being subject to purifying selection among groups of genes with different estimated ages were reported to support a model in which new protein-coding proto-genes arise from noncoding DNA and gradually integrate into cellular networks. Here we show that the genomic patterns asserted to support widespread de novo gene origination are largely attributable to biases in gene age estimation by phylostratigraphy, because such patterns are also observed in phylostratigraphic analysis of simulated genes bearing identical ages. Furthermore, there is no evidence of purifying selection on very young de novo genes previously claimed to show such signals. Together, these findings are consistent with the prevailing view that de novo gene birth is a relatively minor contributor to new genes in genome evolution. They also illustrate the danger of using phylostratigraphy in the study of new gene origination without considering its inherent bias.
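The bias argument can be illustrated with a minimal toy sketch. It is not the simulation procedure used in the study. Every gene is given the same true age, and a crude detectability rule stands in for a BLAST-style homology search. The divergence times, the length and rate ranges, the threshold, and the function names (`estimated_age`, `simulate`, `summarize`) are all assumptions made for illustration only.

```python
import math
import random
from statistics import mean

# Hypothetical divergence times (arbitrary units) to successively more distant outgroups.
DIVERGENCE_TIMES = [1, 2, 4, 8, 16]


def estimated_age(length, rate, threshold=30.0):
    """Oldest divergence time at which the toy homology search still finds a hit."""
    age = 0  # 0 means no homolog was detected in any outgroup
    for t in DIVERGENCE_TIMES:
        # Toy detectability (assumption): expected count of still-conserved sites after time t.
        if length * math.exp(-rate * t) >= threshold:
            age = t
    return age


def simulate(n_genes=2000, seed=0):
    """Simulate genes that are ALL truly as old as the deepest outgroup."""
    rng = random.Random(seed)
    genes = []
    for _ in range(n_genes):
        length = rng.randint(50, 1000)  # protein length in residues (assumed range)
        rate = rng.uniform(0.01, 0.5)   # substitutions per site per time unit (assumed range)
        genes.append((length, rate, estimated_age(length, rate)))
    return genes


def summarize(genes):
    """Gene count, mean length, and mean rate for each estimated age class."""
    summary = {}
    for age in sorted({g[2] for g in genes}):
        group = [g for g in genes if g[2] == age]
        summary[age] = (len(group), mean(g[0] for g in group), mean(g[1] for g in group))
    return summary


if __name__ == "__main__":
    for age, (n, mean_len, mean_rate) in summarize(simulate()).items():
        print(f"estimated age {age:>2}: n={n:4d}  "
              f"mean length={mean_len:6.1f}  mean rate={mean_rate:.3f}")
```

Under these assumptions, genes assigned to the youngest classes tend to be shorter and faster evolving than genes assigned to the oldest class, even though all simulated genes share one true age. This is the kind of pattern that can arise from estimation bias alone.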

Keywords: BLAST; gene age; new genes; phylostratigraphy; proto-gene; yeast.

MeSH terms

  • Animals
  • Biological Evolution*
  • Codon
  • Computer Simulation
  • Databases, Nucleic Acid
  • Evolution, Molecular
  • Gene Duplication
  • Genomics / methods*
  • Humans
  • Models, Genetic*
  • Mutation*
  • Open Reading Frames
  • Phylogeny
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae Proteins / genetics


Substances

  • Codon
  • Saccharomyces cerevisiae Proteins