Origins of De Novo Genes in Human and Chimpanzee

PLoS Genet. 2015 Dec 31;11(12):e1005721. doi: 10.1371/journal.pgen.1005721. eCollection 2015 Dec.

Abstract

The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, the prevalence and underlying mechanisms of de novo gene emergence remain unclear. To obtain a comprehensive view of this process, we performed in-depth sequencing of the transcriptomes of four mammalian species (human, chimpanzee, macaque, and mouse) and subsequently compared the assembled transcripts and the corresponding syntenic genomic regions. This resulted in the identification of over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the other species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes that have evidence of protein translation. Taken together, the data support a model in which frequently occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Evolution, Molecular*
  • Female
  • Gene Expression
  • Genes*
  • Genome, Human*
  • Humans
  • Macaca / genetics
  • Male
  • Mice
  • Pan troglodytes / genetics*
  • Promoter Regions, Genetic
  • Regulatory Sequences, Nucleic Acid
  • Ribonucleoprotein, U1 Small Nuclear / genetics*
  • Testis / physiology
  • Transcription Initiation Site

Substances

  • Ribonucleoprotein, U1 Small Nuclear

Associated data

  • figshare/10.6084/M9.FIGSHARE.1604892
  • figshare/10.6084/M9.FIGSHARE.1604893

Grant support

The main grant was BFU2012-36820 from the Spanish Government, co-funded by the European Regional Development Fund (FEDER). Additional funding came from Instituto de Salud Carlos III, Gobierno de España (grant number PT13/0001); from Agència de Gestió d'Ajuts Universitaris i de Recerca, Generalitat de Catalunya (grant number 2014SGR1121); and from the European Molecular Biology Organization Young Investigators Program 2014 grant awarded to TMB. TMB was also supported by MICINN BFU2014-55090-P, BFU2015-7116-ERC and BFU2015-6215-ERC (www.mecd.gob.es). MA and TMB were supported by ICREA (Institut Català de Recerca i Estudis Avançats), Generalitat de Catalunya. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.