Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences

Nucleic Acids Res. 2011 May;39(10):4220-34. doi: 10.1093/nar/gkr007. Epub 2011 Jan 25.

Abstract

In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5' cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized--both for increased coding capacity and potentially also for novel regulatory mechanisms--remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5' untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.
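
The notion of codons "differing from AUG by a single nucleotide", and of an N-terminal extension lying upstream of and in frame with the annotated AUG, can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which relies on phylogenetic signatures of protein-coding evolution across orthologous 5' untranslated regions. It only enumerates candidate start codons. The function names, the `aug_index` parameter and the toy sequence are hypothetical and chosen for illustration.

```python
from itertools import product

BASES = "ACGU"
STOPS = {"UAA", "UAG", "UGA"}
# Codons the abstract singles out as the most frequent non-AUG initiators.
COMMON_NON_AUG = ("CUG", "UUG", "GUG", "ACG", "AUA", "AUU")


def near_cognate_codons(start="AUG"):
    """Return every codon that differs from `start` at exactly one position."""
    codons = []
    for i, base in product(range(3), BASES):
        if base != start[i]:
            codons.append(start[:i] + base + start[i + 1:])
    return codons


def upstream_candidates(mrna, aug_index, starts=COMMON_NON_AUG):
    """Walk upstream of the annotated AUG, codon by codon and in frame.

    Collects (position, codon) for each candidate start codon and stops at
    the first in-frame stop codon or at the 5' end of the sequence.
    Illustrative assumption: `mrna` is a single spliced transcript and
    `aug_index` is the 0-based position of the annotated AUG.
    """
    mrna = mrna.upper().replace("T", "U")
    hits = []
    pos = aug_index - 3
    while pos >= 0:
        codon = mrna[pos:pos + 3]
        if codon in STOPS:
            break  # an in-frame stop ends any possible extension
        if codon in starts:
            hits.append((pos, codon))
        pos -= 3
    return hits
```

For the toy transcript `UAGCUGGCCACGAUGGCC` with the annotated AUG at index 12, `upstream_candidates` reports ACG at position 9 and CUG at position 3, then stops at the in-frame UAG at position 0. Candidates found this way would still need the conservation evidence described in the abstract before being treated as likely initiation sites.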

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • 5' Untranslated Regions
  • Alternative Splicing
  • Base Sequence
  • Blotting, Western
  • Codon, Initiator / chemistry*
  • Conserved Sequence
  • Evolution, Molecular*
  • Humans
  • Phylogeny
  • RNA, Messenger / chemistry
  • Sequence Alignment
  • Sequence Analysis, RNA

Substances

  • 5' Untranslated Regions
  • Codon, Initiator
  • RNA, Messenger