Genome Res. 2015 Nov;25(11):1610-21.
doi: 10.1101/gr.193342.115. Epub 2015 Aug 21.

Integrative analysis of RNA, translation, and protein levels reveals distinct regulatory variation across humans

Can Cenik et al. Genome Res. 2015 Nov.

Abstract

Elucidating the consequences of genetic differences between humans is essential for understanding phenotypic diversity and personalized medicine. Although variation in RNA levels, transcription factor binding, and chromatin have been explored, little is known about global variation in translation and its genetic determinants. We used ribosome profiling, RNA sequencing, and mass spectrometry to perform an integrated analysis in lymphoblastoid cell lines from a diverse group of individuals. We find significant differences in RNA, translation, and protein levels, suggesting diverse mechanisms of personalized gene expression control. Combined analysis of RNA expression and ribosome occupancy improves the identification of individual protein level differences. Finally, we identify genetic differences that specifically modulate ribosome occupancy; many of these differences lie close to start codons and upstream ORFs. Our results reveal a new level of gene expression variation among humans and indicate that genetic variants can cause changes in protein levels through effects on translation.

Figures

Figure 1.
Choice of RNase is critical for generating ribosome profiling data. (A) A schematic representation of the ribosome profiling strategy is shown. A key step is the digestion of unprotected RNA segments with an RNase. The ribosome-protected RNA segments are isolated using a sucrose cushion and prepared for high-throughput sequencing. (B) Human lymphoblastoid cells (GM12878) were lysed in the presence of cycloheximide. The samples were ultracentrifuged through a 10%–50% sucrose gradient. Samples were fractionated while continuously monitoring absorbance at 254 nm. A representative polysome profile is shown. (C) Samples were prepared for ultracentrifugation as in B with the following exception: The cleared lysate was incubated with 100 units of RNase I (Ambion) for 30 min at RT before the ultracentrifugation. (D) Samples were prepared as in B, except 300 units of RNase T1 (Fermentas) and 500 ng of RNase A (Ambion) were used for the RNase digestion step. A complete digestion of polysomes into monosomes was observed. (E) Schematic representation of the data sets used in the current study. Genotype, ribosome profiling, RNA-seq, and mass spectrometry-based proteomics data were collected from lymphoblastoid cells derived from a diverse group of 30 individuals.
Figure 2.
Ribosome occupancy correlates better with absolute protein levels than RNA expression does. (A) A self-organizing map (SOM) was trained using ribosome occupancy, RNA expression, translation efficiency (TE), and protein levels. These measurements were converted into their relative rank order before training. After training, each neuron in the SOM contains several genes sharing similar expression patterns. (B) Four different colorings of the trained SOM depict the mean ribosome occupancy, RNA expression, translation efficiency, or protein levels for each neuron. (C) Neurons of the SOM were grouped using affinity propagation clustering (Frey and Dueck 2007). Shared coloring between nodes indicates membership in the same cluster. For each cluster, the mean rank in ribosome occupancy (RO), RNA expression (RE), translation efficiency (TE), and protein level (PL) is shown for the representative neuron of the cluster. The number of genes in each cluster (n) is shown. (D) For four of nine clusters, significantly enriched gene ontology (GO) terms were identified (FuncAssociate; permutation-based corrected P-value < 0.05) (Supplemental Table S1; Berriz et al. 2009). For Clusters 5 and 8, selected GO categories are shown (log2 odds ratio). Supplemental Table S1 contains the full list of enriched terms.
Figure 3.
Identification of genes with significant inter-individual variability in RNA expression and ribosome occupancy improves the ability to identify personal differences in protein levels. (A) Ribosome occupancy and RNA expression were modeled using a linear mixed model treating individuals as a random effect and mean expression as the fixed effect. A simulation-based exact likelihood ratio test (Scheipl et al. 2008) was used to compare the linear mixed model to a linear model that did not include the individual as a predictor. The number of genes that show significant inter-individual variation in RNA expression or in ribosome occupancy is plotted (Holm's corrected P-value < 0.05). (B) The Venn diagram depicts the overlap between the two groups. (C) Enriched gene ontology (GO) terms among genes with significant inter-individual variation in both RNA expression and ribosome occupancy were determined using FuncAssociate (Berriz et al. 2009). Cytoscape (Smoot et al. 2011) was used to visualize the enriched GO terms (permutation test corrected P-value < 0.05, odds ratio > 3) (Supplemental Table S2). Nodes correspond to GO terms and are colored by the corrected P-value. The size of the node is proportional to the logarithm of the odds ratio. The similarity between GO terms was quantified using Kappa similarity. The strength of the similarity was visualized using darker edge colors (Supplemental Methods). An edge-weighted spring embedded layout is shown. (D) For each gene, the Spearman correlation was calculated between individual-specific RNA expression and relative protein levels. The distribution of the correlation coefficients was plotted as a density. Genes that showed significant variation in both RNA expression and ribosome occupancy between individuals are plotted with red bars, and genes without detectable variation in RNA expression and ribosome occupancy are shown with white bars.
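The per-gene Spearman correlation described in panel D of Figure 3 can be sketched as follows. This is a minimal illustration and not the authors' code: the function names (`average_ranks`, `spearman`, `per_gene_spearman`) and the dictionary-based input layout are assumptions made for the example, and ties are handled by average ranks, which the caption does not specify.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def per_gene_spearman(rna, protein):
    """rna, protein: dict mapping gene -> per-individual values (same order).

    Hypothetical input layout chosen for this sketch.
    """
    return {g: spearman(rna[g], protein[g]) for g in rna if g in protein}
```

Calling `per_gene_spearman` on two such dictionaries yields one coefficient per shared gene; the distribution of those coefficients is what the caption describes plotting as a density.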
Figure 4.
Nucleotide variants that modify upstream ORFs can alter ribosome occupancy of the main coding region. (A) We identified single nucleotide polymorphisms that generate, delete, or otherwise modify an upstream open reading frame (uORF). We tested whether changes to uORFs affected ribosome occupancy of the main coding region using a linear regression framework. The absolute value of the effect size from the regression was plotted against the P-value of association. For 17 uORF changes shown with red circles, the association was solely with ribosome occupancy (nominal P-value > 0.05 or opposite signed regression coefficients for RNA expression). Supplemental Table S5 shows the robustness to population stratification and linear mixed model. (B) A SNP in the 5′ UTR of the LENG8 gene introduces a premature in-frame stop codon that shortens an existing uORF. This event results in lower ribosome occupancy of the main coding region, as shown in the boxplot (pRibo = 0.002). The horizontal bar reflects the median of the distribution, and the box depicts the interquartile range. The whiskers are drawn to 1.5 times the interquartile range. (C) In another example, SRRM1, a SNP completely eliminates an existing uORF by removing its start codon. The loss of this uORF is associated with reduced ribosome occupancy of the main coding region (pRibo = 0.0004; pRNA = 0.19). (D) The reference sequence of the ZNF215 gene has two short uORFs. Two different genetic variants eliminate the stop codon of the first uORF (UGA to UAC or UGA to CAA), resulting in merging of the two short uORFs into a single long uORF. The merging of the uORFs significantly modulates both ribosome occupancy and RNA expression (pRibo = 0.0001 and pRNA = 10^−9, respectively).
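A regression of ribosome occupancy on genotype, as mentioned in panel A of Figure 4, might look like the sketch below. It assumes an additive coding of genotype as allele dosage (0, 1, or 2 copies of the variant allele) and a single predictor with no covariates; the caption does not state either detail, and the function name `genotype_regression` is hypothetical. The sketch returns the slope (effect size) and its t-statistic; converting the t-statistic to a P-value would require a t distribution with n − 2 degrees of freedom from a statistics library.

```python
from statistics import mean


def genotype_regression(dosage, occupancy):
    """Ordinary least squares fit of occupancy ~ intercept + slope * dosage.

    dosage: allele counts per individual (assumed additive coding 0/1/2).
    occupancy: ribosome occupancy per individual, in the same order.
    Returns (slope, intercept, t_statistic).
    """
    n = len(dosage)
    if n != len(occupancy):
        raise ValueError("dosage and occupancy must have the same length")
    mx, my = mean(dosage), mean(occupancy)
    sxx = sum((x - mx) ** 2 for x in dosage)
    if n < 3 or sxx == 0:
        raise ValueError("need at least 3 individuals and 2 distinct genotypes")
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosage, occupancy))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares and the standard error of the slope.
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(dosage, occupancy))
    se = (rss / (n - 2) / sxx) ** 0.5
    t_stat = slope / se if se > 0 else float("inf") * (1 if slope >= 0 else -1)
    return slope, intercept, t_stat
```

The absolute value of `slope` corresponds to the effect size plotted in panel A under these assumptions; running the same fit on RNA expression would allow the comparison of coefficient signs that the caption describes.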
Figure 5.
Nucleotide variants modulating the sequence around the translation initiation site alter translation efficiency. (A) The Kozak region is defined as the 6 nt preceding and 2 nt following the start codon. The derived position weight matrix was visualized using WebLogo (Crooks et al. 2004). The upper panel shows the effects of each nucleotide at the −3 position on translation efficiency. The effect of nucleotides on translation efficiency was tested using the Kruskal–Wallis test. (B) The effect of a Kozak region variant on the ribosome occupancy of NTPCR was assessed using a linear model (P-value = 1.1 × 10^−6). A boxplot was used to visualize the distribution of ribosome occupancy for individuals with given genotypes. The horizontal bar reflects the median of the distribution and the box is drawn to depict the interquartile range. (C) WDR11 had two naturally occurring SNPs in its Kozak region. An additive model was adopted to calculate the change in the position weight matrix score of the Kozak region. (D) 5′ UTRs with or without Kozak variants were cloned into a translation efficiency reporter. The reporter expresses a bicistronic mRNA, in which the Renilla luciferase is translated under the control of the cloned 5′ UTR, and the firefly luciferase is translated under the control of the hepatitis C virus (HCV) internal ribosome entry site (IRES). (E,F) The ratio of Renilla to firefly luciferase activity was plotted for NTPCR (E) and WDR11 (F). Error bars represent SEM. The difference between the ratios was assessed using a two-sided two-sample t-test. (*) P-value < 0.05.
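The additive position weight matrix (PWM) scoring mentioned in panel C of Figure 5 can be illustrated with a short sketch. Several choices here are assumptions rather than details from the caption: the Kozak region is represented as an 8-character string (6 nt upstream plus 2 nt downstream of the start codon, with the start codon itself omitted), the PWM holds log2 odds against a uniform background, a pseudocount of 1 is used, and the function names are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sequences, pseudocount=1.0):
    """Build a log2-odds PWM from equal-length sequences (uniform background)."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    total = len(sequences) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / 0.25)
            for b in BASES
        })
    return pwm


def score(pwm, seq):
    """PWM score of a sequence: sum of the per-position log-odds."""
    return sum(column[base] for column, base in zip(pwm, seq))


def additive_score_change(pwm, ref, variants):
    """Change in PWM score when each (position, alt_base) variant is applied.

    Under an additive model each variant contributes independently, so the
    total change is the sum of the per-position differences.
    """
    return sum(pwm[pos][alt] - pwm[pos][ref[pos]] for pos, alt in variants)
```

Because the score is a sum over positions, the combined effect of two SNPs at different positions equals the difference between the scores of the doubly substituted and reference sequences, which is the property an additive model relies on.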
