New genes from non-coding sequence: the role of de novo protein-coding genes in eukaryotic evolutionary innovation

Philos Trans R Soc Lond B Biol Sci. 2015 Sep 26;370(1678):20140332. doi: 10.1098/rstb.2014.0332.


The origin of novel protein-coding genes de novo was once considered so improbable as to be impossible. In less than a decade, and especially in the last five years, this view has been overturned by extensive evidence from diverse eukaryotic lineages. There is now evidence that this mechanism has contributed a significant number of genes to genomes of organisms as diverse as Saccharomyces, Drosophila, Plasmodium, Arabidopsis and human. From simple beginnings, these genes have in some instances acquired complex structure, regulated expression and important functional roles. New genes are often thought of as dispensable late additions; however, some recent de novo genes in human can play a role in disease. Rather than an extremely rare occurrence, it is now evident that there is a relatively constant trickle of proto-genes released into the testing ground of natural selection. It is currently unknown whether de novo genes arise primarily through an 'RNA-first' or 'ORF-first' pathway. Either way, evolutionary tinkering with this pool of genetic potential may have been a significant player in the origins of lineage-specific traits and adaptations.

Keywords: de novo genes; evolution; open reading frame; proto-genes.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Biological Evolution*
  • Eukaryota / genetics*
  • Gene Expression Regulation / genetics*
  • Humans
  • Plants / genetics*
  • RNA, Untranslated / genetics*