Evidence for conservation and selection of upstream open reading frames suggests probable encoding of bioactive peptides

BMC Genomics. 2006 Jan 26;7:16. doi: 10.1186/1471-2164-7-16.


Background: Approximately 40% of mammalian mRNA sequences contain AUG trinucleotides upstream of the main coding sequence, and a quarter of these AUGs demarcate open reading frames of 20 or more codons. To investigate whether these open reading frames encode functional peptides, we carried out a comparative genomic analysis of human and mouse mRNA 'untranslated regions' using sequences from the RefSeq mRNA sequence database.
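
The definition above (an AUG upstream of the main coding sequence that opens a reading frame of 20 or more codons) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `find_uorfs`, the DNA alphabet (T for U, as in RefSeq records), counting the start codon but not the stop codon towards the length, and skipping frames that lack an in-frame stop are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(mrna, cds_start, min_codons=20):
    """Return (start, end, n_codons) for each upstream ORF (hypothetical helper).

    mrna      -- full transcript sequence (DNA or RNA alphabet)
    cds_start -- 0-based index of the main coding sequence's start codon
    Coordinates are 0-based and end-exclusive; `end` includes the stop codon.
    """
    seq = mrna.upper().replace("U", "T")
    uorfs = []
    # Only consider ATGs that lie entirely upstream of the main start codon.
    for start in range(0, max(0, cds_start - 2)):
        if seq[start:start + 3] != "ATG":
            continue
        # Read codon by codon in the ATG's frame until the first stop codon.
        # The frame may run past cds_start (an overlapping uORF).
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3  # start codon included, stop excluded
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3, n_codons))
                break  # frames with no in-frame stop are never recorded
    return uorfs


# Toy transcript: a 20-codon uORF, a 2-nt spacer, then a short main CDS.
toy = "GG" + "ATG" + "GCC" * 19 + "TAA" + "CC" + "ATG" + "AAA" * 3 + "TGA"
print(find_uorfs(toy, cds_start=67))
```

The 20-codon threshold is exposed as `min_codons` so that other cut-offs can be tried without changing the scan itself.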

Results: We identified over 200 upstream open reading frames that are strongly conserved between the human and mouse genomes. Consensus sequences associated with efficient initiation of translation are overrepresented at the AUG trinucleotides of these upstream open reading frames, and comparative analysis of their DNA and putative peptide sequences shows evidence of purifying selection.
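
As an illustration of what an initiation-context check might look like, the sketch below grades the nucleotides flanking an AUG. The abstract does not say which consensus was used; the sketch assumes a Kozak-like context in which a purine at position -3 and a G at position +4 (with the A of AUG as +1) favour initiation. The three-level grading and the function name `kozak_context` are a common simplification chosen for illustration, not the authors' scoring scheme.

```python
def kozak_context(seq, aug_pos):
    """Grade the initiation context of the AUG at 0-based index aug_pos.

    Assumed simplification: 'strong' if both -3 is a purine and +4 is G,
    'adequate' if only one condition holds, 'weak' if neither does.
    """
    seq = seq.upper().replace("U", "T")
    # Position -3 is three nucleotides before the A of the start codon.
    minus3 = seq[aug_pos - 3] if aug_pos >= 3 else ""
    # Position +4 is the nucleotide immediately after the start codon.
    plus4 = seq[aug_pos + 3] if aug_pos + 3 < len(seq) else ""
    hits = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "adequate", "strong")[hits]


print(kozak_context("GCCACCATGG", 6))  # purine at -3 and G at +4
print(kozak_context("TTTTTTATGC", 6))  # neither position matches
```

Overrepresentation could then be assessed by comparing the share of 'strong' contexts at conserved upstream AUGs against a background set of AUGs; that comparison is not shown here.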

Conclusion: The occurrence of a large number of conserved upstream open reading frames, in association with features consistent with protein translation, strongly suggests evolutionary maintenance of the coding sequence and indicates probable functional expression of the peptides encoded within these upstream open reading frames.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions / genetics
  • 5' Untranslated Regions / genetics
  • Animals
  • Codon / genetics
  • Consensus Sequence / genetics
  • Evolution, Molecular*
  • Humans
  • Mice
  • Mutation
  • Open Reading Frames / genetics*
  • Peptide Chain Initiation, Translational / genetics
  • Peptides / genetics*
  • RNA, Messenger / genetics
  • Selection, Genetic*
  • Sequence Homology, Nucleic Acid
  • Species Specificity


Substances

  • 3' Untranslated Regions
  • 5' Untranslated Regions
  • Codon
  • Peptides
  • RNA, Messenger