Extensive identification and analysis of conserved small ORFs in animals
- PMID: 26364619
- PMCID: PMC4568590
- DOI: 10.1186/s13059-015-0742-x
Abstract
Background: There is increasing evidence that transcripts or transcript regions annotated as non-coding can harbor functional short open reading frames (sORFs). Loss-of-function experiments have identified essential developmental or physiological roles for a few of the encoded peptides (micropeptides), but genome-wide experimental or computational identification of functional sORFs remains challenging.
Results: Here, we expand our previously developed method and present results of an integrated computational pipeline for the identification of conserved sORFs in human, mouse, zebrafish, fruit fly, and the nematode C. elegans. Isolating specific conservation signatures indicative of purifying selection on amino acid (rather than nucleotide) sequence, we identify about 2,000 novel small ORFs located in the untranslated regions of canonical mRNAs or on transcripts annotated as non-coding. Predicted sORFs show stronger conservation signatures than those identified in previous studies and are sometimes conserved over large evolutionary distances. The encoded peptides have little homology to known proteins and are enriched in disordered regions and short linear interaction motifs. Published ribosome profiling data indicate translation of more than 100 novel sORFs, and mass spectrometry data provide evidence for more than 70 novel candidates.
Conclusions: Taken together, we identify hundreds of previously unknown conserved sORFs in major model organisms. Our computational analyses and integration with experimental data show that these sORFs are expressed, often translated, and sometimes widely conserved, in some cases even between vertebrates and invertebrates. We thus provide an integrated resource of putatively functional micropeptides for functional validation in vivo.
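The distinction drawn in the Results between selection on amino acid sequence and selection on nucleotide sequence can be illustrated with a toy calculation. The sketch below is not the authors' pipeline; it only counts, for two already aligned in-frame ORFs, how many codon differences leave the encoded amino acid unchanged (synonymous) and how many alter it (nonsynonymous). An excess of synonymous changes is the kind of pattern that suggests purifying selection on the peptide. The function names and the two short sequences are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), _AMINO)}


def codons(seq):
    """Split an in-frame nucleotide sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def count_substitutions(ref_orf, other_orf):
    """Count (synonymous, nonsynonymous) codon differences between two aligned ORFs.

    Assumes both sequences are aligned codon by codon; codons containing gaps
    or ambiguous bases are skipped rather than scored.
    """
    syn = nonsyn = 0
    for a, b in zip(codons(ref_orf), codons(other_orf)):
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1      # nucleotides changed, amino acid preserved
        else:
            nonsyn += 1   # amino acid changed
    return syn, nonsyn


# Made-up toy sequences (not real data): both encode Met-Ala-Lys-stop.
ref = "ATGGCTAAATAA"
ortholog = "ATGGCCAAGTAA"
print(count_substitutions(ref, ortholog))  # (2, 0)
```

Here both differences are synonymous, so the peptide is identical even though the nucleotide sequences diverge. A real analysis would work from multiple-species alignments and a proper statistical model rather than raw counts.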
Comment in
- Finding smORFs: getting closer. Genome Biol. 2015 Sep 14;16(1):189. doi: 10.1186/s13059-015-0765-3. PMID: 26364669. Free PMC article.
Similar articles
- Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs. BMC Genomics. 2013 Sep 23;14:648. doi: 10.1186/1471-2164-14-648. PMID: 24059539. Free PMC article.
- Ultra-deep sequencing of ribosome-associated poly-adenylated RNA in early Drosophila embryos reveals hundreds of conserved translated sORFs. DNA Res. 2016 Dec;23(6):571-580. doi: 10.1093/dnares/dsw040. Epub 2016 Aug 24. PMID: 27559081. Free PMC article.
- A large number of novel coding small open reading frames in the intergenic regions of the Arabidopsis thaliana genome are transcribed and/or under purifying selection. Genome Res. 2007 May;17(5):632-40. doi: 10.1101/gr.5836207. Epub 2007 Mar 29. PMID: 17395691. Free PMC article.
- Revisiting sORFs: overcoming challenges to identify and characterize functional microproteins. FEBS J. 2022 Jan;289(1):53-74. doi: 10.1111/febs.15769. Epub 2021 Feb 24. PMID: 33595896. Review.
- Mining for Small Translated ORFs. J Proteome Res. 2018 Jan 5;17(1):1-11. doi: 10.1021/acs.jproteome.7b00707. Epub 2017 Dec 11. PMID: 29188713. Review.
Cited by
- Long non-coding RNAs in cardiac hypertrophy and heart failure: functions, mechanisms and clinical prospects. Nat Rev Cardiol. 2024 May;21(5):326-345. doi: 10.1038/s41569-023-00952-5. Epub 2023 Nov 20. PMID: 37985696. Free PMC article. Review.
- Coding or Noncoding, the Converging Concepts of RNAs. Front Genet. 2019 May 22;10:496. doi: 10.3389/fgene.2019.00496. eCollection 2019. PMID: 31178900. Free PMC article. Review.
- Pegasus, a small extracellular peptide enhancing short-range diffusion of Wingless. Nat Commun. 2021 Sep 27;12(1):5660. doi: 10.1038/s41467-021-25785-z. PMID: 34580289. Free PMC article.
- Exhaustive identification of conserved upstream open reading frames with potential translational regulatory functions from animal genomes. Sci Rep. 2020 Oct 1;10(1):16289. doi: 10.1038/s41598-020-73307-6. PMID: 33004976. Free PMC article.
- Machine Learning in Agriculture: A Review. Sensors (Basel). 2018 Aug 14;18(8):2674. doi: 10.3390/s18082674. PMID: 30110960. Free PMC article. Review.
