Nature, 485 (7397), 264-8

Systematic Discovery of Structural Elements Governing Stability of Mammalian Messenger RNAs


Hani Goodarzi et al. Nature.


Decoding post-transcriptional regulatory programs in RNA is a critical step towards the larger goal of developing predictive dynamical models of cellular behaviour. Despite recent efforts, the vast landscape of RNA regulatory elements remains largely uncharacterized. A long-standing obstacle is the contribution of local RNA secondary structure to the definition of interaction partners in a variety of regulatory contexts, including, but not limited to, transcript stability, alternative splicing and localization. There are many documented instances where the presence of a structural regulatory element dictates alternative splicing patterns (for example, human cardiac troponin T) or affects other aspects of RNA biology. Thus, a full characterization of post-transcriptional regulatory programs requires capturing information provided by both local secondary structures and the underlying sequence. Here we present a computational framework based on context-free grammars and mutual information that systematically explores the immense space of small structural elements and reveals motifs that are significantly informative of genome-wide measurements of RNA behaviour. By applying this framework to genome-wide human mRNA stability data, we reveal eight highly significant elements with substantial structural information, for the strongest of which we show a major role in global mRNA regulation. Through biochemistry, mass spectrometry and in vivo binding studies, we identified human HNRPA2B1 (heterogeneous nuclear ribonucleoprotein A2/B1, also known as HNRNPA2B1) as the key regulator that binds this element and stabilizes a large number of its target genes. We created a global post-transcriptional regulatory map based on the identity of the discovered linear and structural cis-regulatory elements, their regulatory interactions and their target pathways. This approach could also be used to reveal the structural elements that modulate other aspects of RNA behaviour.


Figure 1. Discovery of RNA structural motifs informative of genome-wide transcript stability
Each RNA structural motif is shown along with its pattern of enrichment/depletion across the range of mRNA stability measurements throughout the genome. The transcripts are partitioned into equally populated bins based on their stability measures, going from left (highly stable) to right (unstable). In the heatmap representation, a gold entry marks the enrichment of the given motif in its corresponding stability bin (measured by log-transformed hypergeometric p-values), while a light-blue entry indicates motif depletion in the bin. Red and blue borders mark highly significant motif enrichments and depletions, respectively. Included are the motif names, their location (UP for 5’ UTR and DN for 3’ UTR), their sequence information (in the form of a logo) and their frequency (the fraction of transcripts that carry at least one instance of the motif). Also shown are the associated mutual information values. Each mutual information (MI) value is used to calculate a z-score, which is the number of standard deviations separating the actual MI from the MI values calculated for 1.5 million randomly shuffled stability profiles. A structural illustration of each motif is also presented using the following single-letter nucleotide code: Y=[UC], R=[AG], K=[UG], M=[AC], S=[GC], W=[AU], B=[GUC], D=[GAU], H=[ACU], V=[GCA] and N=any nucleotide.
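The MI z-score described in this caption can be illustrated with a minimal sketch. It assumes that each transcript is reduced to a binary motif-presence flag and a stability-bin index, and that the z-score is (observed MI minus the mean of the shuffled MI values) divided by their standard deviation. The function names, the toy data and the small shuffle count are hypothetical and are not taken from TEISER.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def mi_zscore(has_motif, stability_bin, n_shuffles=1000, seed=0):
    """Return (observed MI, z-score against MI of shuffled stability profiles).

    Assumption: the null distribution is built by permuting the bin labels
    across transcripts; the caption reports 1.5 million shuffles, far more
    than the default used in this toy version.
    """
    rng = random.Random(seed)
    observed = mutual_information(has_motif, stability_bin)
    shuffled = list(stability_bin)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(mutual_information(has_motif, shuffled))
    mean = sum(null) / len(null)
    sd = math.sqrt(sum((v - mean) ** 2 for v in null) / (len(null) - 1))
    return observed, (observed - mean) / sd


# Toy data: 10 bins of 20 transcripts, motif present only in the first bin.
toy_bins = [b for b in range(10) for _ in range(20)]
toy_motif = [1 if b == 0 else 0 for b in toy_bins]
toy_mi, toy_z = mi_zscore(toy_motif, toy_bins)
print(f"MI = {toy_mi:.3f} bits, z = {toy_z:.1f}")
```

In this toy case the motif flag is fully determined by the bin, so the observed MI equals the entropy of the motif flag, and the shuffled profiles yield much smaller values.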
Figure 2. The regulatory role of sRSM1
Whole-genome expression levels were measured in decoy-transfected samples relative to the controls transfected with scrambled RNA molecules (see Methods). The measurements were performed in duplicate, for two independent decoy/scrambled sets (the relative transcript levels were subsequently averaged across the two replicates in each set). Genes were sorted and quantized into equally populated bins based on the average log-ratio of their expression levels in the decoy samples relative to the scrambled controls. TEISER was used to show the enrichment/depletion patterns of transcripts harboring sRSM1 in their 3’ UTRs. Mutual information values and the associated z-scores are also presented.
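The binning and enrichment steps mentioned in these captions can be sketched as follows. This is an illustrative sketch only: it assumes that genes are ranked by log-ratio and split into bins whose sizes differ by at most one, and that each bin is scored with a signed log10 hypergeometric p-value (positive for enrichment, negative for depletion). The function names and the sign convention are assumptions, not TEISER's actual interface.

```python
from math import comb, log10


def equal_population_bins(values, n_bins):
    """Assign each value a bin index so that bins are (nearly) equally populated."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def hypergeom_pmf(i, N, K, n):
    """P(exactly i motif carriers in a bin of size n), given K carriers among N genes."""
    return comb(K, i) * comb(N - K, n - i) / comb(N, n)


def enrichment_scores(has_motif, bins, n_bins):
    """Signed log10 p-value per bin: > 0 for enrichment, < 0 for depletion."""
    N, K = len(has_motif), sum(has_motif)
    scores = []
    for b in range(n_bins):
        members = [m for m, bb in zip(has_motif, bins) if bb == b]
        n, k = len(members), sum(members)
        p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, n + 1))
        p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(0, k + 1))
        scores.append(-log10(p_over) if p_over <= p_under else log10(p_under))
    return scores


# Toy data: 10 genes with increasing log-ratios; the two lowest carry the motif.
log_ratios = [-2.0, -1.5, -0.7, -0.2, 0.0, 0.1, 0.4, 0.9, 1.3, 2.1]
motif = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
gene_bins = equal_population_bins(log_ratios, 5)
print(enrichment_scores(motif, gene_bins, 5))
```

The exact binomial-coefficient arithmetic keeps the sketch dependency-free; for genome-scale inputs a library routine for the hypergeometric tail would be a more practical choice.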
Figure 3. HNRPA2B1 stabilizes transcripts through direct in vivo binding to sRSM1 structural motifs
a, Genome-wide expression levels were measured in HNRPA2B1 siRNA-transfected samples relative to mock-transfected controls. TEISER was used to capture the enrichment/depletion pattern of transcripts carrying sRSM1 across the relative expression values. Experiments were performed in triplicate, each with an independent siRNA targeting HNRPA2B1 and the resulting log ratios were averaged for each transcript. b, Transcript decay rates were compared in HNRPA2B1 knock-downs versus mock-transfected controls. These measurements were then analyzed by TEISER to visualize the extent to which the decay rates of transcripts carrying sRSM1 elements were increased following HNRPA2B1 knock-down. c, Using UV-crosslinking followed by immunoprecipitation, mRNAs that bind HNRPA2B1 were extracted and compared against the input mRNA population (RIP-chip). The log ratio calculated for each mRNA denotes its abundance in the immunoprecipitated sample relative to the input control. Bins to the right contain the mRNAs that were captured as interacting partners with HNRPA2B1. Similar to the prior examples, TEISER was used to show the enrichment/depletion pattern of transcripts carrying sRSM1 in their 3’ UTRs. The values associated with each transcript were calculated as the average of log ratios from biological replicates. d, HNRPA2B1 binding sites were identified using immunoprecipitation followed by high-throughput sequencing (HITS-CLIP). Instances of the sRSM1 element are significantly enriched in these sites relative to a population of random sequences from 3’ UTRs that are not represented in the sequenced population.
Figure 4. HNRPA2B1 regulates growth rate
a, Whole-genome expression levels across five breast cancer cell lines (MCF7, MDA-MB-231, HS578T, BT-549 and T47D) were correlated against their doubling times. The resulting values, ranging from −1 to 1, were analyzed by TEISER to probe the enrichment/depletion pattern of transcripts carrying sRSM1. b, The growth of HNRPA2B1 siRNA-transfected samples was compared to that of mock-transfected controls. For each time-point, the number of cells in four independent samples was counted in duplicate (n=8), yielding an estimated growth-rate (α). Shown are the average log-ratios, their standard deviation at each time-point, and the statistical significance of the observed difference in growth-rate.
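One way a growth-rate α could be estimated from cell counts is sketched below. The caption does not state the growth model, so this sketch assumes exponential growth, N(t) = N0·exp(αt), and fits α as the least-squares slope of ln(count) against time. The function names and the toy counts are hypothetical.

```python
import math


def growth_rate(times, counts):
    """Least-squares slope of ln(count) versus time (assumes exponential growth)."""
    logs = [math.log(c) for c in counts]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx


def doubling_time(alpha):
    """Doubling time implied by an exponential growth-rate alpha."""
    return math.log(2) / alpha


# Toy data: counts that double every 24 hours.
hours = [0, 24, 48, 72]
cells = [1000, 2000, 4000, 8000]
alpha = growth_rate(hours, cells)
print(f"alpha = {alpha:.4f} per hour, doubling time = {doubling_time(alpha):.1f} h")
```

Under the same assumption, the doubling times used in panel a relate to α through ln(2)/α, so a lower growth-rate corresponds to a longer doubling time.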



