Small Proteins Encoded by Unannotated ORFs are Rising Stars of the Proteome, Confirming Shortcomings in Genome Annotations and Current Vision of an mRNA

Proteomics. 2018 May;18(10):e1700058. doi: 10.1002/pmic.201700058. Epub 2017 Oct 11.


Short ORF-encoded peptides and small proteins in eukaryotes have long been hidden in the shadow of large proteins. Recently, improved identification by MS-based proteomics and ribosome profiling has led to the detection of large numbers of small proteins, and the variety of their functions is also emerging. It seems to be the right time to reflect on why small proteins remained invisible. Beyond the obvious technical challenge of detecting them, they were mostly omitted from annotations, and they escaped detection because they were not sought. In this review, we identify conventions that need to be revisited, including the assumption that mature mRNAs carry only one coding sequence. The large-scale discovery of small proteins and of their functions will require changing some paradigms and undertaking the annotation of ORFs that are still largely perceived as irrelevant coding information compared with already annotated coding sequences.

Keywords: Alternative translation; Short ORFs; mRNA.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Genome, Human
  • Genomics
  • Humans
  • Molecular Sequence Annotation*
  • Open Reading Frames*
  • Protein Biosynthesis*
  • Proteins / genetics
  • Proteins / metabolism*
  • Proteome / metabolism*
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism*
  • Ribosomes

Substances

  • Proteins
  • Proteome
  • RNA, Messenger