A survey of mRNA sequences with a non-AUG start codon in RefSeq database

J Biomol Struct Dyn. 2006 Aug;24(1):33-42. doi: 10.1080/07391102.2006.10507096.

Abstract

Alternative initiation of translation is one of the important mechanisms by which multiple proteins are synthesized from a single mRNA. Several experimental studies have reported translation initiation at a non-AUG codon. We have analyzed all mRNA sequences in the RefSeq database and found that the coding regions of about 0.1% of the total mRNA sequences begin with a non-AUG codon (non-AUG mRNAs). A major fraction of the non-AUG mRNAs is predicted from genomic sequences. More than 100 non-AUG sequences are highly curated, and 52 of them are explicitly annotated as using alternate start codons for translation initiation. Analysis of these sequences reveals that the majority of the protein products contain domains that are DNA/RNA-binding, kinases, or growth factors, or that are involved in immune response or cell proliferation. Thus, the proteins translated from non-canonical codons seem to be implicated in regulatory roles and/or signaling mechanisms. The sequence context of the non-AUG start codons shows that a purine at the -3 position and/or a G at the +4 position is strongly conserved, and that the corresponding genes give rise to alternate transcripts and/or multiple isoforms. We have also developed a database, "nonAUG" (http://bioinfo.iitk.ac.in), that contains a collection of all mRNA sequences whose coding regions start with a non-AUG codon. The nonAUG database will be continuously updated and is freely available to the scientific community.
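The sequence-context check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `start_context`, the `StartContext` record, and the example sequences are hypothetical, and the sketch assumes the annotated coding-region start is already known as a 0-based index into the mRNA. Positions follow the usual convention in which the first nucleotide of the start codon is +1 and there is no position 0.

```python
from typing import NamedTuple, Optional

PURINES = {"A", "G"}


class StartContext(NamedTuple):
    """Hypothetical record describing an annotated start codon and its flanks."""
    codon: str
    is_non_aug: bool
    purine_at_minus3: Optional[bool]  # None if the 5' UTR is shorter than 3 nt
    g_at_plus4: Optional[bool]        # None if the sequence ends after the codon


def start_context(mrna: str, cds_start: int) -> StartContext:
    """Report the start codon and its -3/+4 context.

    `cds_start` is assumed to be the 0-based index of the first nucleotide
    of the annotated coding region (position +1).
    """
    # Accept DNA-style records (T) as well as RNA (U), in either case.
    seq = mrna.upper().replace("T", "U")
    if cds_start < 0:
        raise ValueError("cds_start must be non-negative")
    codon = seq[cds_start:cds_start + 3]
    if len(codon) != 3:
        raise ValueError("CDS start leaves no room for a full codon")

    # Position -3 is three nucleotides upstream of +1.
    minus3 = seq[cds_start - 3] if cds_start >= 3 else None
    # Position +4 is the nucleotide immediately after the start codon.
    plus4 = seq[cds_start + 3] if cds_start + 3 < len(seq) else None

    return StartContext(
        codon=codon,
        is_non_aug=codon != "AUG",
        purine_at_minus3=None if minus3 is None else minus3 in PURINES,
        g_at_plus4=None if plus4 is None else plus4 == "G",
    )


# Made-up example sequence, not taken from RefSeq or the nonAUG database.
ctx = start_context("GCCACCCUGGCG", 6)
print(ctx.codon, ctx.is_non_aug, ctx.purine_at_minus3, ctx.g_at_plus4)
```

Applied to every record with an annotated coding region, a check of this kind would flag the non-AUG entries and tally how often the -3 purine and +4 G occur. Returning `None` for a missing flank keeps short 5' UTRs from being counted as mismatches.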

MeSH terms

  • Codon, Initiator*
  • Databases, Genetic*
  • RNA, Messenger*
  • Sequence Analysis, RNA*

Substances

  • Codon, Initiator
  • RNA, Messenger