Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs

Nucleic Acids Res. 1984 Jan 25;12(2):857-72. doi: 10.1093/nar/12.2.857.


5'-Noncoding sequences have been tabulated for 211 messenger RNAs from higher eukaryotic cells. The 5'-proximal AUG triplet serves as the initiator codon in 95% of the mRNAs examined. The most conspicuous conserved feature is the presence of a purine (most often A) three nucleotides upstream from the AUG initiator codon; only 6 of the mRNAs in the survey have a pyrimidine in that position. There is a predominance of C in positions -1, -2, -4 and -5, just upstream from the initiator codon. The sequence CC(A/G)CCAUG(G) thus emerges as a consensus sequence for eukaryotic initiation sites. The extent to which the ribosome binding site in a given mRNA matches the -1 to -5 consensus sequence varies: more than half of the mRNAs in the tabulation have 3 or 4 nucleotides in common with the CCACC consensus, but only ten mRNAs conform perfectly.
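
The comparison described above (counting how many of the five nucleotides at positions -5 to -1 agree with CCACC, and checking for a purine at -3) can be sketched as follows. This is an illustrative sketch only, not the procedure used in the survey; the function name `score_context`, its return format, and the example sequences are all hypothetical.

```python
# Illustrative sketch; names and example sequences are hypothetical, not from the paper.
CONSENSUS = "CCACC"  # positions -5 .. -1, as in the "CCACC consensus" of the abstract


def score_context(mrna: str, aug_index: int) -> dict:
    """Compare the five nucleotides preceding an AUG with the CCACC consensus.

    mrna      -- mRNA (or cDNA) sequence; T is treated as U
    aug_index -- 0-based index of the A of the AUG to evaluate
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 5:
        raise ValueError("need at least 5 nucleotides upstream of the AUG")

    upstream = seq[aug_index - 5:aug_index]  # upstream[0] is -5, upstream[4] is -1
    matches = sum(a == b for a, b in zip(upstream, CONSENSUS))
    return {
        "upstream": upstream,
        "matches": matches,                       # 0..5 positions agreeing with CCACC
        "purine_at_minus_3": upstream[2] in "AG",  # A or G three nucleotides before AUG
    }


if __name__ == "__main__":
    # Made-up sequences: a perfect match, a G at -3, and a poor context.
    for s in ("GGCCACCAUGG", "GGCCGCCAUGG", "AAUUUUUAUGG"):
        print(s, score_context(s, 7))
```

Counting a G at -3 as a mismatch to CCACC while still reporting it as a purine keeps the two observations in the abstract separate: agreement with the five-nucleotide consensus, and the more strongly conserved purine at position -3.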

Publication types

  • Comparative Study

MeSH terms

  • Base Sequence*
  • Protein Biosynthesis*
  • Proteins / genetics
  • RNA, Messenger / genetics*
  • Structure-Activity Relationship

Substances

  • Proteins
  • RNA, Messenger