Genome-wide search for novel human uORFs and N-terminal protein extensions using ribosomal footprinting

Genome Res. 2012 Nov;22(11):2208-18. doi: 10.1101/gr.139568.112. Epub 2012 Aug 9.


So far, the annotation of translation initiation sites (TISs) has been based mostly upon bioinformatics rather than experimental evidence. We adapted ribosomal footprinting to puromycin-treated cells to generate a transcriptome-wide map of TISs in a human monocytic cell line. A neural network was trained on the ribosomal footprints observed at previously annotated AUG translation initiation codons (TICs), and used for the ab initio prediction of TISs in 5062 transcripts with sufficient sequence coverage. Functional interpretation suggested 2994 novel upstream open reading frames (uORFs) in the 5' UTR, 1406 uORFs overlapping with the coding sequence, and 546 N-terminal protein extensions. The TIS detection method was validated on the basis of previously published alternative TISs and uORFs. Among primates, TICs in newly annotated TISs were significantly more conserved than control codons, both for AUGs and near-cognate codons. The transcriptome-wide map of novel candidate TISs derived as part of the study will shed further light on the way in which human proteome diversity is influenced by alternative translation initiation and regulation.
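The functional interpretation step sorts each candidate TIS upstream of an annotated start codon into one of three classes: a uORF contained in the 5' UTR, a uORF overlapping the coding sequence, or an N-terminal protein extension. The sketch below shows one plausible way to make that call from reading frame and the first in-frame stop codon. It is not the authors' pipeline. The function name `classify_upstream_tis`, the 0-based coordinates, and the exact class boundaries are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_upstream_tis(mrna, tis, cds_start):
    """Classify a candidate TIS lying upstream of the annotated start codon.

    Hypothetical helper: `tis` and `cds_start` are 0-based indices of the
    first nucleotide of the candidate and the annotated initiation codon.
    """
    if not 0 <= tis < cds_start:
        raise ValueError("candidate TIS must lie upstream of the annotated start")
    mrna = mrna.upper().replace("U", "T")

    # Walk codon by codon from the candidate TIS to the first in-frame stop.
    stop = None
    for i in range(tis + 3, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            stop = i
            break
    if stop is None:
        return "no in-frame stop"

    # Assumed rule: the ORF, stop codon included, ends before the annotated start.
    if stop + 3 <= cds_start:
        return "uORF in 5' UTR"
    # Assumed rule: same frame as the annotated start and no intervening stop.
    if (cds_start - tis) % 3 == 0:
        return "N-terminal extension"
    # Remaining case: a different frame that reads into the coding sequence.
    return "uORF overlapping CDS"


if __name__ == "__main__":
    # Toy transcripts, not real sequences; the second uses a near-cognate CUG.
    print(classify_upstream_tis("ATGAAATAGCCATGGCCTAA", 0, 11))
    print(classify_upstream_tis("CTGAAAATGGCCTAA", 0, 6))
    print(classify_upstream_tis("ATGGATGCCTGACTAA", 0, 4))
```

The candidate codon itself is never inspected, so the same rule applies to AUG and near-cognate initiation codons alike.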

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 5' Untranslated Regions / genetics
  • Binding Sites
  • Cell Line
  • Codon, Initiator / genetics
  • Codon, Initiator / metabolism
  • DNA, Complementary / chemistry
  • Genome, Human*
  • Humans
  • Open Reading Frames / genetics*
  • Peptide Chain Initiation, Translational / genetics*
  • Puromycin
  • Ribosomes / metabolism*
  • Sequence Analysis, DNA
  • Transcriptome

Substances

  • 5' Untranslated Regions
  • Codon, Initiator
  • DNA, Complementary
  • Puromycin