Nature. 2013 Jul 11;499(7457):172-7.
doi: 10.1038/nature12311.

A Compendium of RNA-binding Motifs for Decoding Gene Regulation

Free PMC article

Debashish Ray et al. Nature.


RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.


Figure 1. RNAcompete data for 207 RBPs
a, 7-mer Z scores and motifs for the two probe sets for ZC3H10. b, Two-dimensional hierarchical clustering analysis (Pearson correlation, average linkage) of E scores for 7-mers with E ≥ 0.4 in at least one experiment, with the two halves of the array kept as separate rows. Long systematic names have been shortened to species abbreviations and RNAcompete assay numbers. c, ROC curves showing discrimination of bound and unbound RNAs by the corresponding protein in vivo. The curve with the highest AUROC is shown if there are multiple in vivo data sets for a protein. FUS and TAF15 were excluded.
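The AUROC summarised in panel c can be read as the probability that a randomly chosen bound RNA receives a higher motif score than a randomly chosen unbound RNA. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `auroc` and the scores are hypothetical and chosen only for illustration.

```python
def auroc(bound_scores, unbound_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (bound, unbound) pairs in which the bound RNA
    scores higher; ties count as half a win.
    """
    if not bound_scores or not unbound_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for b in bound_scores:
        for u in unbound_scores:
            if b > u:
                wins += 1.0
            elif b == u:
                wins += 0.5
    return wins / (len(bound_scores) * len(unbound_scores))


# Made-up motif scores (assumption: higher score = stronger predicted binding).
bound = [0.9, 0.7, 0.6]
unbound = [0.8, 0.3, 0.2, 0.1]
print(auroc(bound, unbound))
```

An AUROC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of bound from unbound RNAs. The pairwise loop is quadratic in the number of RNAs, which is acceptable for a sketch but not for transcriptome-scale data.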
Figure 2. Motifs obtained by RNAcompete for RRM (outer ring) and KH domain proteins (inner ring)
The dendrograms represent complete linkage hierarchical clustering of RBPs by amino acid sequence identity in their RBDs. Line colours indicate species of origin of each protein, and shading indicates clades in which all sequences are more than 70% (dark) or 50% (light) identical.
Figure 3. RBD sequence identity enables inference of RNA motifs
a, Motif similarity versus per cent amino acid sequence identity in all RBDs for pairs of proteins. Motif similarity scored using STAMP Pearson-based log10(E value), correlation between PFM affinity scores against 10,000 random-sequence 100-mers, or human 3′ UTRs (for human RBPs). Columns indicate average; error bars indicate standard deviation. Red points: new proteins analysed (see c). b, Stacked bars indicate proportion of each category of RBP encompassed by experimentally determined motifs or inferred motifs using stringent (RNAcompete motifs, ≥70% identity) or expanded criteria (RNAcompete and literature motifs, ≥50% identity) in 288 eukaryotes (Supplementary Data 9). ‘Multi-RBD’ and ‘All’ indicate proteins with >1 or >0 RBDs, respectively. c, Validation of motifs predicted for proteins at 61–96% amino acid identity (red text indicates validation motifs).
Figure 4. Conservation of motif matches in human RNA regulatory regions
a, Heat map showing conservation in 50-nucleotide bins (columns) in regions indicated at the top of the panel. Rows represent the most significant motif for indicated protein family (see Supplementary Table 4). Box fill: conservation score of the most conserved position in the motif for each bin. Border colour: conservation score when the entire regulatory region is considered as a single bin. Asterisks indicate known splicing factors. b, Alignment of vertebrate sequences over the ESRP1/2 site in the USF1 3′ UTR. Sequence logos are shown for major branches of vertebrate taxonomy. Dashed box: motif derived from the full alignment. The RNAcompete motif for ESRP1/2 is shown to the right.
Figure 5. RBFOX1 is a putative regulator of RNA stability in autism
a, Significance (as rank-sum Z score) of bias that RBP motifs in 3′ UTRs of mRNAs confer towards correlated expression with the RBP’s mRNA (FDR <0.1). b, Scatter plot shows Z score (from a) versus rank-sum Z score of the same target set, with mRNAs ranked instead by decay rate in MDA-MB-231 cells, for expressed RBPs. c, Enrichment of predicted RBFOX1 stability targets (by ‘leading-edge’ analysis) among transcripts with conserved RBFOX1 motifs. d, Density plot showing that RBFOX1 targets are enriched among transcripts most affected by RBFOX1 RNAi. e, Relationship of mRNA expression levels in autism spectrum disorder brains to RBFOX1 expression and predicted RBFOX1 target status.
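The rank-sum Z scores in panels a and b express how strongly a set of predicted targets is shifted relative to the remaining mRNAs in a ranked list. The sketch below uses the standard Wilcoxon rank-sum (Mann-Whitney) statistic with a normal approximation and no tie correction; it is an assumption about the general form of such a score, not a reproduction of the authors' exact procedure, and the function name `rank_sum_z` and the values are hypothetical.

```python
import math


def rank_sum_z(target_values, background_values):
    """Z score of the Mann-Whitney U statistic (normal approximation).

    Positive values mean the targets tend to have higher values than the
    background. Ties count as half; no tie correction is applied to the
    variance.
    """
    n1, n2 = len(target_values), len(background_values)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # U = number of (target, background) pairs in which the target is larger.
    u = 0.0
    for t in target_values:
        for b in background_values:
            if t > b:
                u += 1.0
            elif t == b:
                u += 0.5
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u


# Made-up values, e.g. correlation of each mRNA with the RBP's own mRNA.
targets = [5.0, 6.0, 7.0]
background = [1.0, 2.0, 3.0, 4.0]
print(rank_sum_z(targets, background))
```

Swapping the two groups flips the sign of the score, so the direction of the bias is preserved alongside its significance.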

Cited by 505 articles
