Direct detection of alternative open reading frames translation products in human significantly expands the proteome

PLoS One. 2013 Aug 12;8(8):e70698. doi: 10.1371/journal.pone.0070698. eCollection 2013.

Abstract

A fully mature mRNA is usually associated with a reference open reading frame encoding a single protein. Yet, mature mRNAs contain unconventional alternative open reading frames (AltORFs) located in untranslated regions (UTRs) or overlapping the reference ORFs (RefORFs) in non-canonical +2 and +3 reading frames. Although recent ribosome profiling and footprinting approaches have suggested the significant use of unconventional translation initiation sites in mammals, direct evidence of large-scale alternative protein expression at the proteome level is still lacking. To determine the contribution of alternative proteins to the human proteome, we generated a database of predicted human AltORFs revealing a new proteome mainly composed of small proteins with a median length of 57 amino acids, compared to 344 amino acids for the reference proteome. We experimentally detected a total of 1,259 alternative proteins by mass spectrometry analyses of human cell lines, tissues and fluids. In plasma and serum, alternative proteins represent up to 55% of the proteome and may be an unsuspected new source of biomarkers. We observed constitutive co-expression of RefORFs and AltORFs from endogenous genes and from transfected cDNAs, including tumor suppressor p53, and provide evidence that out-of-frame clones representing AltORFs are mistakenly rejected as false positives in cDNA screening assays. Functional importance of alternative proteins is strongly supported by significant evolutionary conservation in vertebrates, invertebrates, and yeast. Our results imply that coding of multiple proteins in a single gene by the use of AltORFs may be a common feature in eukaryotes, and confirm that translation of unconventional ORFs generates an as yet unexplored proteome.
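The idea of predicting AltORFs from a mature mRNA can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the ATG-to-stop ORF definition, and the `min_codons` cutoff are all assumptions chosen for illustration, and the abstract does not state the prediction criteria actually used. The sketch scans the three forward reading frames of a transcript, then labels each ORF other than the RefORF by its location (5'UTR, 3'UTR, or overlapping) and by its frame relative to the RefORF, with +1 taken to mean the reference frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(mrna, min_codons=30):
    """Return (frame, start, end) for ATG...stop ORFs in the three forward frames.

    frame is 1, 2 or 3 relative to the first nucleotide of the transcript.
    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts coding codons (stop excluded); its default is an
    arbitrary illustrative value, not a threshold taken from the paper.
    """
    seq = mrna.upper().replace("U", "T")
    orfs = []
    for offset in range(3):
        start = None
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((offset + 1, start, i + 3))
                start = None
    return orfs

def classify_alt_orfs(mrna, ref_start, ref_end, min_codons=30):
    """Label every non-reference ORF as (start, end, location, relative_frame).

    The RefORF occupies [ref_start, ref_end) on the transcript. ORFs that share
    the RefORF stop codon (the RefORF itself or an in-frame extension) are skipped.
    """
    results = []
    for _, start, end in find_orfs(mrna, min_codons):
        if end == ref_end:
            continue
        if end <= ref_start:
            location = "5'UTR"
        elif start >= ref_end:
            location = "3'UTR"
        else:
            location = "overlapping"
        # +1 is the reference frame; +2 and +3 are shifted by 1 and 2 nucleotides.
        relative_frame = "+%d" % ((start - ref_start) % 3 + 1)
        results.append((start, end, location, relative_frame))
    return results
```

For a toy transcript such as `"CCATGGATGCCCTGAAATAA"`, with the RefORF at positions 2 to 20 and `min_codons=2`, the sketch reports one ORF overlapping the RefORF in the +2 frame. A real analysis would also need to handle transcript isoforms, translation of the ORFs into protein sequences, and whatever initiation-context rules the prediction database applies.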

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing*
  • Amino Acid Sequence
  • BRCA1 Protein / chemistry
  • BRCA1 Protein / genetics
  • BRCA1 Protein / metabolism
  • Cell Line
  • Computational Biology / methods
  • Databases, Genetic
  • Gene Expression
  • Humans
  • Molecular Sequence Data
  • Open Reading Frames*
  • Peptide Chain Initiation, Translational
  • Protein Binding
  • Protein Biosynthesis*
  • Proteome*
  • Proteomics* / methods
  • Reproducibility of Results
  • Sequence Alignment
  • Transfection

Substances

  • BRCA1 Protein
  • Proteome