Exhaustive identification of conserved upstream open reading frames with potential translational regulatory functions from animal genomes

Sci Rep. 2020 Oct 1;10(1):16289. doi: 10.1038/s41598-020-73307-6.


Upstream open reading frames (uORFs) are present in the 5'-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and it is unclear how many of the previously identified CPuORFs encode regulatory peptides. Using ESUCA, we extracted 1517 CPuORFs (1373 novel and 144 known) from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. The number of animal CPuORFs discovered here is much higher than the number previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, the identified set is expected to contain many CPuORFs that encode regulatory peptides.
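
To make the notion of a uORF concrete, the following is a minimal Python sketch of how candidate uORFs could be enumerated in a single 5'-untranslated region. It is not the ESUCA pipeline, which additionally relies on cross-species conservation of the encoded peptide and is not described in detail here. The function name `find_uorfs`, the `min_codons` parameter, and the choices to require an AUG start codon and an in-frame stop codon inside the supplied sequence are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_uorfs(utr5, min_codons=2):
    """Return (start, end, peptide) for each ATG-initiated ORF in a 5'-UTR.

    Hypothetical helper: only uORFs whose in-frame stop codon lies inside
    the supplied sequence are reported. Coordinates are 0-based and
    half-open, and `end` includes the stop codon.
    """
    seq = utr5.upper().replace("U", "T")  # accept RNA or DNA input
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        peptide = []
        # Walk codon by codon from the start codon until a stop is reached.
        for pos in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = ambiguous codon
            if aa == "*":
                if len(peptide) >= min_codons:
                    uorfs.append((start, pos + 3, "".join(peptide)))
                break
            peptide.append(aa)
    return uorfs


if __name__ == "__main__":
    for start, end, peptide in find_uorfs("GGATGAAACCCTAGGG"):
        print(start, end, peptide)
```

In a conservation-based approach such as the one the abstract describes, peptides obtained this way would then be compared across species; that comparison step is outside the scope of this sketch.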

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chickens / genetics
  • Conserved Sequence / genetics*
  • Drosophila melanogaster / genetics
  • Gene Expression Regulation / genetics*
  • Genome / genetics
  • Humans
  • Open Reading Frames / genetics*
  • Protein Biosynthesis / genetics
  • Zebrafish / genetics