The life history of retrocopies illuminates the evolution of new mammalian genes

Genome Res. 2016 Mar;26(3):301-14. doi: 10.1101/gr.198473.115. Epub 2016 Jan 4.


New genes contribute substantially to adaptive evolutionary innovation, but the functional evolution of new mammalian genes has been little explored at a broad scale. Previous work established mRNA-derived gene duplicates, known as retrocopies, as models for the study of new gene origination. Here we combine mammalian transcriptomic and epigenomic data to unveil the processes underlying the evolution of stripped-down retrocopies into complex new genes. We show that although some robustly expressed retrocopies are transcribed from preexisting promoters, most evolved new promoters from scratch or recruited proto-promoters in their genomic vicinity. In particular, many retrocopy promoters emerged from ancestral enhancers (or bivalent regulatory elements) or are located in CpG islands not associated with other genes. We detected 88-280 selectively preserved retrocopies per mammalian species, illustrating that these mechanisms facilitated the birth of many functional retrogenes during mammalian evolution. The regulatory evolution of originally monoexonic retrocopies was frequently accompanied by exon gain, which facilitated co-option of distant promoters and allowed expression of alternative isoforms. While young retrogenes are often initially expressed in the testis, increased regulatory and structural complexities allowed retrogenes to functionally diversify and evolve somatic organ functions, sometimes as complex as those of their parents. Thus, some retrogenes evolved the capacity to temporarily substitute for their parents during the process of male meiotic X inactivation, while others rendered parental functions superfluous, allowing for parental gene loss. Overall, our reconstruction of the "life history" of mammalian retrogenes highlights retroposition as a general model for understanding new gene birth and functional evolution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computational Biology / methods
  • Epigenesis, Genetic
  • Evolution, Molecular*
  • Gene Expression Profiling
  • Genes
  • Humans
  • Mammals / genetics*
  • Molecular Sequence Annotation
  • Phylogeny
  • Promoter Regions, Genetic
  • Regulatory Sequences, Nucleic Acid
  • Retroelements*
  • Transcriptome
  • Vertebrates / genetics

