De novo emergence of adaptive membrane proteins from thymine-rich genomic sequences

Nat Commun. 2020 Feb 7;11(1):781. doi: 10.1038/s41467-020-14500-z.


Recent evidence demonstrates that novel protein-coding genes can arise de novo from non-genic loci. This evolutionary innovation is thought to be facilitated by the pervasive translation of non-genic transcripts, which exposes a reservoir of variable polypeptides to natural selection. Here, we systematically characterize how these de novo emerging coding sequences impact fitness in budding yeast. Disruption of emerging sequences is generally inconsequential for fitness in the laboratory and in natural populations. Overexpression of emerging sequences, however, is enriched in adaptive fitness effects compared to overexpression of established genes. We find that adaptive emerging sequences tend to encode putative transmembrane domains, and that thymine-rich intergenic regions harbor a widespread potential to produce transmembrane domains. These findings, together with in-depth examination of the de novo emerging YBR196C-A locus, suggest a novel evolutionary model whereby adaptive transmembrane polypeptides emerge de novo from thymine-rich non-genic regions and subsequently accumulate changes molded by natural selection.
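One intuition behind the link between thymine-rich sequence and transmembrane potential is that, in the standard genetic code, thymine-rich codons tend to encode hydrophobic residues (for example TTT for phenylalanine, CTT for leucine, ATT for isoleucine). The following is a minimal sketch of that intuition only, not the authors' analysis: it assumes the Kyte-Doolittle hydropathy scale as a crude proxy for membrane propensity, and the function names `translate` and `mean_hydropathy_by_t_count` are hypothetical, introduced here for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G and the third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy values; higher means more hydrophobic.
# Assumption: used here only as a rough stand-in for membrane propensity.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def mean_hydropathy_by_t_count() -> dict[int, float]:
    """Average hydropathy of sense codons, grouped by number of thymines."""
    groups: dict[int, list[float]] = {0: [], 1: [], 2: [], 3: []}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # skip stop codons
            groups[codon.count("T")].append(KD[aa])
    return {n: sum(vals) / len(vals) for n, vals in groups.items()}


if __name__ == "__main__":
    for n, mean in mean_hydropathy_by_t_count().items():
        print(f"codons with {n} T: mean hydropathy {mean:+.2f}")
```

Under these assumptions, codons with no thymine average a negative (hydrophilic) score, while codons with two or three thymines average a clearly positive (hydrophobic) one. A dedicated transmembrane-domain predictor would be needed to say anything about actual membrane insertion.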

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adaptation, Biological / genetics
  • Endoplasmic Reticulum / genetics
  • Endoplasmic Reticulum / metabolism
  • Evolution, Molecular*
  • Gene Expression Regulation, Fungal
  • Genetic Fitness
  • Intracellular Membranes / metabolism
  • Membrane Proteins / chemistry
  • Membrane Proteins / genetics*
  • Open Reading Frames
  • Protein Domains / genetics
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae Proteins / genetics*
  • TATA-Binding Protein Associated Factors / genetics*
  • Thymine*
  • Transcription Factor TFIID / genetics*

Substances

  • Membrane Proteins
  • Saccharomyces cerevisiae Proteins
  • TAF5 protein, S cerevisiae
  • TATA-Binding Protein Associated Factors
  • Transcription Factor TFIID
  • Thymine