Polycistronic peptide coding genes in eukaryotes--how widespread are they?

Brief Funct Genomic Proteomic. 2009 Jan;8(1):68-74. doi: 10.1093/bfgp/eln054. Epub 2008 Dec 12.

Abstract

The classical textbook assumption about the structure of a eukaryotic gene is that it codes for a single polypeptide of more than 100 amino acids in length. This is also the implicit assumption in most gene annotation pipelines. A gene family has now been discovered in insects that shows that a eukaryotic mRNA can code for peptides as short as eleven amino acids and that a single mRNA can code for several such peptides. This raises the question of whether short open reading frames in other mRNAs might also be functional, in particular those that occur in the 5'-UTR of many mRNAs. A number of these have been shown to act in cis to regulate the translation of the main open reading frame of the mRNA. But there may be others that act in trans on other biological processes. The question of how many peptide-coding genes may exist is therefore worth revisiting. This poses new bioinformatic challenges that can only be resolved through multiple genome comparisons across a range of evolutionary distances.
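
To make the annotation problem concrete, the following is a minimal sketch of how candidate short open reading frames could be enumerated in a single mRNA sequence. It is an illustration only, not the method used in the article: the function name `find_short_orfs`, the use of ATG as the only start codon, the standard stop codons, and the default bounds (11 and 100 codons, taken from the lengths mentioned in the abstract) are all assumptions. Such a scan yields many candidates by chance alone, which is why the abstract points to multi-genome comparison as the way to separate functional peptides from noise; that step is not attempted here.

```python
# Hypothetical helper: enumerate short ORFs on the forward strand of one mRNA.
# Assumes ATG start codons and the standard stop codons (TAA, TAG, TGA).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_short_orfs(mrna, min_codons=11, max_codons=100):
    """Return sorted (start, end, n_codons) tuples for ATG...stop ORFs.

    start and end are 0-based, end-exclusive nucleotide coordinates; end
    includes the stop codon. n_codons counts coding codons only (the stop
    codon is excluded). Only the first ATG of each ORF is reported.
    """
    seq = mrna.upper().replace("U", "T")  # accept RNA or DNA alphabet
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new ORF at the first in-frame ATG
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)


if __name__ == "__main__":
    # Toy transcript with two short ORFs in different reading frames.
    toy = "GG" + "ATG" + "GCT" * 10 + "TAA" + "CC" + "ATG" + "GCA" * 11 + "TGA"
    for start, end, n in find_short_orfs(toy):
        print(start, end, n)
```

A single toy transcript can carry several such ORFs, mirroring the polycistronic arrangement described above; ranking them would require the cross-species conservation evidence the abstract calls for.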

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • 5' Untranslated Regions
  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Computational Biology / methods
  • Drosophila / genetics
  • Gene Expression Regulation*
  • Humans
  • Models, Genetic
  • Molecular Sequence Data
  • Open Reading Frames
  • Peptides / chemistry
  • Peptides / genetics*
  • Protein Biosynthesis*
  • RNA, Messenger / metabolism
  • Sequence Alignment

Substances

  • 5' Untranslated Regions
  • Peptides
  • RNA, Messenger