Proc Natl Acad Sci U S A. 2016 Nov 8;113(45):E7126-E7135.
doi: 10.1073/pnas.1614788113. Epub 2016 Oct 21.

Super-resolution ribosome profiling reveals unannotated translation events in Arabidopsis

Polly Yingshan Hsu et al. Proc Natl Acad Sci U S A.

Abstract

Deep sequencing of ribosome footprints (ribosome profiling) maps and quantifies mRNA translation. Because ribosomes decode mRNA every 3 nt, the periodic property of ribosome footprints could be used to identify novel translated ORFs. However, due to the limited resolution of existing methods, the 3-nt periodicity is observed mostly in a global analysis, but not in individual transcripts. Here, we report a protocol applied to Arabidopsis that maps over 90% of the footprints to the main reading frame and thus offers super-resolution profiles for individual transcripts to precisely define translated regions. The resulting data not only support many annotated and predicted noncanonical translation events but also uncover small ORFs in annotated noncoding RNAs and pseudogenes. A substantial number of these unannotated ORFs are evolutionarily conserved, and some produce stable proteins. Thus, our study provides a valuable resource for plant genomics and an efficient optimization strategy for ribosome profiling in other organisms.

Keywords: Ribo-seq; ncRNA; ribosome footprint; sORF; translation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Identifying translated ORFs using ribosome-profiling data. (A) Experimental workflow of ribosome profiling and the expected read distribution among the reading frames. (B) Data analysis workflow for ORF finding using RiboTaper. (C) Our 28-nt ribosome footprints in the Arabidopsis root mapped to the annotated protein-coding genes in TAIR10. Results for other footprint lengths are shown in Fig. S4A. The inferred footprint positions related to the initiating and terminating ribosomes are shown. The A site (the entry point for the aminoacyl-tRNA), P site (where peptide-bond formation occurs), and E site (the exit site of the uncharged tRNA) within ribosomes are shown. A region of 63 nt near the start and stop codons is shown. The position of a ribosome footprint is indicated by the 13th nucleotide of that footprint. Three reading frames are shown in red (the main frame according to the annotated start codon), blue, and green. Most of the footprints map within the CDS and show enrichment for the main reading frame. Footprints at translation initiation and termination revealed that the ribosomal P site is located between the 13th and 15th nucleotides, whereas the A site is located between the 16th and 18th nucleotides.
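The frame assignment described in this legend can be illustrated with a minimal sketch. It assumes 0-based transcript coordinates, a half-open CDS interval, and a P-site offset of 12 (the 13th nucleotide of a 28-nt footprint, as stated in the legend). The function names are hypothetical and this is not the RiboTaper implementation.

```python
from collections import Counter

# Assumption: 0-based index of the 13th nucleotide, the first P-site
# position reported in the legend for 28-nt footprints.
P_SITE_OFFSET = 12


def p_site_position(read_start, offset=P_SITE_OFFSET):
    """Transcript coordinate of the first P-site nucleotide of a footprint."""
    return read_start + offset


def reading_frame(read_start, cds_start, offset=P_SITE_OFFSET):
    """Frame (0, 1 or 2) of a footprint relative to the annotated start codon.

    Frame 0 corresponds to the main frame (red in the figure).
    """
    return (p_site_position(read_start, offset) - cds_start) % 3


def frame_counts(read_starts, cds_start, cds_end, offset=P_SITE_OFFSET):
    """Count footprints per frame, keeping only P sites inside the CDS.

    cds_start is the first nucleotide of the start codon; cds_end is
    exclusive (hypothetical convention chosen for this sketch).
    """
    counts = Counter({0: 0, 1: 0, 2: 0})
    for start in read_starts:
        pos = p_site_position(start, offset)
        if cds_start <= pos < cds_end:
            counts[(pos - cds_start) % 3] += 1
    return counts


if __name__ == "__main__":
    # Toy read start positions, not data from the study.
    print(frame_counts([88, 91, 94, 95, 50], cds_start=100, cds_end=400))
```

With this convention, a footprint whose 5′ end lies 12 nt upstream of the start codon places its P site on the first nucleotide of the start codon and is therefore counted in frame 0.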
Fig. 2.
Comparison between the current study and published Arabidopsis ribosome-profiling datasets. (A) Length distribution of ribosome footprints in the current study (Hsu_root and Hsu_shoot), compared with three other published datasets in Arabidopsis (–27). See SI Materials and Methods for details of the growth conditions for each dataset. Size of footprints isolated in each dataset is compared in Table S1. (B) Percentage of Ribo-seq reads in the max reading frame. Data were extracted from the meta-gene analysis using 28-nt footprints in which most of the datasets display the best 3-nt periodicity. The gray line marks 33%, which is the percentage of reads expected if there is no enrichment in any frame. (C) Number of protein-coding genes with translated ORFs identified by RiboTaper with different sequencing depths. (D) Percentage of protein-coding genes with translated ORFs identified among the expressed protein-coding genes defined by different RNA expression cutoffs. A subset of each dataset (25 million reads) was compared across the studies.
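The quantity plotted in panel B can be sketched as follows, assuming per-frame read counts are already available from a meta-gene analysis. The counts below are illustrative placeholders, not values from any of the compared datasets, and the function names are hypothetical.

```python
def max_frame_fraction(counts):
    """Fraction of reads that fall in the most populated reading frame.

    counts: sequence of three non-negative read counts, one per frame.
    """
    if len(counts) != 3:
        raise ValueError("expected exactly three frame counts")
    total = sum(counts)
    if total == 0:
        raise ValueError("no reads to analyse")
    return max(counts) / total


def enrichment_over_uniform(counts):
    """Ratio of the max-frame fraction to the 1/3 expected without enrichment."""
    return max_frame_fraction(counts) / (1.0 / 3.0)


if __name__ == "__main__":
    illustrative = [900, 60, 40]   # hypothetical counts for frames 0, 1, 2
    uniform = [100, 100, 100]      # no periodicity: every frame equally covered
    print(f"{max_frame_fraction(illustrative):.2f}")
    print(f"{max_frame_fraction(uniform):.2f}")
```

A dataset with no 3-nt periodicity would give a fraction close to 1/3, which is the baseline marked by the gray line in the figure.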
Fig. 3.
Distinct profiles of annotated protein-coding genes and a well-characterized ncRNA. RNA-seq and P sites in ribosome footprints in root are shown for the following genes: (A) TUB4, a highly expressed gene; (B) HID1, a well-characterized ncRNA whose function is solely contributed by the RNA (31) and whose footprints do not display a clear 3-nt periodicity; the y axis is truncated to visualize low-abundance reads; (C) GLV6, a gene that encodes a secreted peptide with low expression levels and a short ORF; (D) AT3G10985, which uses an upstream CUG start codon (indicated by a black triangle). Annotated gene model and chromosome coordinates are indicated under each Ribo-seq profile. Within the gene model: gray box, 5′-UTR; black box, CDS; white arrow, 3′-UTR. Ribo-seq reads are shown by plotting their first nucleotide of the P site. Three reading frames are shown in red (the expected frame according to the predicted start codon), blue, and green. Footprints that are outside of the predicted coding sequences are shown in gray. The predicted start codon position is indicated by a black dashed line on each Ribo-seq profile panel; the predicted stop codon position is indicated by a gray dashed line.
Fig. 4.
uORFs and an unannotated ORF revealed by ribosome profiling. RNA-seq and P sites in ribosome footprints in root (A and B) or shoot (C) for the following genes: (A) CPuORF51 (orange box) and an unannotated uORF (yellow box) within the 5′-UTR of AT3G53670. (B) An unannotated ORF identified as a uORF within the 5′-UTR of AT5G17460 appears to be an ORF for an unannotated gene. The RNA-seq reads only cover a portion of the 5′-UTR of AT5G17460, suggesting the ORF identified (yellow box) represents the CDS of an unannotated gene, rather than a uORF of AT5G17460. (C) A uORF initiating at a non-AUG codon within the 5′-UTR of AT4G26850 in the shoot. The uORF is marked as a yellow box in the 5′-UTR; the previously reported start codon (ACG; ref. 44) is indicated by an empty triangle underneath. Gene model and data presentation are the same as described in the legend of Fig. 3.
Fig. 5.
Translated ORFs identified within annotated ncRNAs. (A–C) RNA-seq and P sites in ribosome footprints in root for three ORFs identified within annotated ncRNAs. The predicted CDS and 5′-UTR are depicted as black and gray boxes, respectively. The 3′-UTR is represented by a white arrow. Data presentation is the same as described in the legend of Fig. 3. (D) A schematic diagram of HA-tagged constructs and Western blot analysis of proteins produced by the three annotated ncRNAs in A–C. Total protein in root of control plants (Col-0) and transgenic plants expressing individual HA-tagged proteins was isolated and analyzed with either anti-HA antibody or anti-UGPase antibody as a loading control.
Fig. 6.
Representative sequence alignments of unannotated ORFs in A. thaliana with corresponding homologs in 15 other plants. (A) An ORF identified in an annotated ncRNA. (B) An ORF identified in an unannotated gene overlapping with AT5G17460 (denoted as AT5G17460x; also known as sORF32). (C) An ORF identified in a pseudogene. If there are multiple homologs identified in one genome, the homolog with the highest sequence identity to A. thaliana is shown. Amino acids with the same functional groups are shown in similar colors. Note that all these protein sequences have very similar start (the left-most methionine) and stop positions (X).
Fig. 7.
Homolog sequence identities of translated ORFs found in annotated ncRNAs or in an unannotated gene. A heat map showing amino acid sequence identities between translated ORFs within annotated ncRNAs/unannotated gene (sORF32) in A. thaliana and their corresponding homologs in 15 other plant species. A phylogenetic tree showing evolutionary divergence is on the Left. One homolog with the best sequence identity in each genome is represented here. The ORFs can be further grouped based on their homologs identified in other species (I to VI).
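A pairwise amino acid identity of the kind summarized in such a heat map could be computed along these lines. This is a sketch under stated assumptions: the two sequences are already aligned to equal length with "-" as the gap character, and identity is defined as matches divided by all columns in which at least one sequence has a residue. The legend does not say which definition the authors used, and the sequences below are toy strings, not ORFs from the study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned protein sequences.

    Columns where both sequences carry a gap are ignored; a residue
    opposite a gap counts as a mismatch (one of several possible
    conventions, assumed here for illustration).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column is empty in both sequences
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned peptides (hypothetical).
    print(round(percent_identity("MKT-LV", "MKSALV"), 1))
```

Because the denominator convention changes the result, identities computed this way are only comparable across species when the same convention is applied throughout.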

References

    1. King HA, Gerber AP. Translatome profiling: Methods for genome-scale analysis of mRNA translation. Brief Funct Genomics. 2016;15(1):22–31.
    2. Ingolia NT, Ghaemmaghami S, Newman JRS, Weissman JS. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324(5924):218–223.
    3. Ingolia NT, Brar GA, Rouskin S, McGeachy AM, Weissman JS. The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat Protoc. 2012;7(8):1534–1550.
    4. Brar GA, et al. High-resolution view of the yeast meiotic program revealed by ribosome profiling. Science. 2012;335(6068):552–557.
    5. Stern-Ginossar N, et al. Decoding human cytomegalovirus. Science. 2012;338(6110):1088–1093.