Nature. 2012 Feb 15;482(7385):339-46.
doi: 10.1038/nature10887.

Modular Regulatory Principles of Large Non-Coding RNAs

Free PMC article

Mitchell Guttman et al. Nature. 2012.


It is clear that RNA has a diverse set of functions and is more than just a messenger between gene and protein. The mammalian genome is extensively transcribed, giving rise to thousands of non-coding transcripts. Whether all of these transcripts are functional is debated, but it is evident that there are many functional large non-coding RNAs (ncRNAs). Recent studies have begun to explore the functional diversity and mechanistic role of these large ncRNAs. Here we synthesize these studies to provide an emerging model whereby large ncRNAs might achieve regulatory specificity through modularity, assembling diverse combinations of proteins and possibly RNA and DNA interactions.


Figure 1. Layering of genomic regions
a, Genomic regions are colour-coded by the presence of different genomic annotations: RNA transcription of a locus (grey), K4–K36 chromatin signature (red), K4me1 modification and transcriptional activator p300 (green) and protein-coding potential (blue). By overlaying this information, distinct transcripts are revealed, including ncRNAs (red), protein-coding genes (purple) and transcripts from enhancer regions (green). b, A cross-species alignment of a coding and a non-coding gene. Boxes represent codons, and each row represents a different aligned species. Blue boxes represent mutations that cause a synonymous substitution, and red boxes represent mutations that cause a non-synonymous substitution. A score capturing the coding potential of a sequence across species aligns sequences in all frames and scores mutations that maintain coding potential (blue boxes) relative to mutations that break coding potential (that is, non-synonymous mutations, stop codons and frameshifting insertions or deletions) (red boxes). c, The coding potential score is shown for three gene types, SIRT1 (a protein-coding gene), XIST (an ncRNA gene) and tarsal-less (a small-peptide coding gene), in which positive scores represent coding regions (blue) and negative scores represent non-coding regions (red). In each example, the gene structure is shown, where blue boxes represent known protein-coding exons and red boxes represent non-coding exons. SIRT1, with an ORF length of 576 amino acids, has a positive score over each coding exon but not over the non-coding regions. XIST, with an ORF length of 172 amino acids, has negative scores over the entire transcribed region. tarsal-less, with ORFs of 11 and 32 amino acids, has positive scores over all known small peptides.
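The scoring idea in panel b can be illustrated with a toy sketch. This is not the score used in the paper: the function names, the +1/-1 weights, the pairwise (two-sequence) comparison and the treatment of gaps are all assumptions made for illustration, and the input is assumed to be uppercase A/C/G/T with "-" for alignment gaps.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def toy_coding_score(reference, other):
    """Hypothetical score: +1 per synonymous codon change, -1 per change
    that breaks coding potential (non-synonymous, stop, or gap)."""
    score = 0
    for i in range(0, min(len(reference), len(other)) - 2, 3):
        ref_codon, alt_codon = reference[i:i + 3], other[i:i + 3]
        if ref_codon == alt_codon:
            continue  # identical codons carry no signal in this sketch
        if "-" in ref_codon or "-" in alt_codon:
            score -= 1  # assumption: any gapped codon counts as breaking
        elif CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
            score += 1  # synonymous substitution
        else:
            score -= 1  # non-synonymous substitution or stop codon
    return score


def best_frame_score(reference, other):
    """Score the aligned pair in each of the three forward frames; keep the best."""
    return max(toy_coding_score(reference[f:], other[f:]) for f in range(3))


if __name__ == "__main__":
    print(best_frame_score("ATGGCTAAA", "ATGGCCAAA"))  # GCT -> GCC is synonymous
    print(best_frame_score("ATGGCTAAA", "ATGGATAAA"))  # GCT -> GAT changes the residue
```

Under these assumptions, a positive total suggests that substitutions tend to preserve the encoded peptide, and a negative total suggests that they do not, mirroring the blue and red boxes in the panel.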
Figure 2. Classification of ncRNA function
a, Illustration of an ncRNA with expression patterns related to the NFκB pathway. Each row represents a gene, and a positive association (red box) is assigned between the ncRNA and the pathway based on the correlation of the genes in the process. Similarly, the ncRNA is assigned a negative association (blue box) with the p53 pathway based on anticorrelation with the genes in the process. b, The scores for each functional term and ncRNA can be clustered to identify classes of ncRNAs. In this example (adapted, with permission, from ref. 25) each column represents a different ncRNA, and each row represents a different functional term. c, A model of ncRNAs that have a cis-function by remaining tethered to their site of transcription. In this model, RNA polymerase (green) transcribes an RNA (red), which can associate with regulatory proteins (purple) to affect neighbouring regions, as proposed for XIST. d, One model for ncRNA trans-regulation. In this model, an ncRNA can associate with DNA-binding proteins (blue) and regulatory proteins to localize to and affect the expression of its targets, as proposed for HOTAIR. e, A model for ncRNAs that bind regulatory proteins and change their activity, in this case leading to a change in modification state and expression of the target gene, as proposed for the CCND1 ncRNAs, which interact with the TLS protein. f, A model for ncRNAs that act as ‘decoys’. In this model, ncRNAs bind protein complexes and prevent them from binding to their proper regulatory targets, as proposed for GAS5 and PANDA.
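The association step in panel a can be sketched as a correlation between an ncRNA's expression profile and the profiles of a pathway's genes. The caption does not specify the statistic, so the mean Pearson correlation, the 0.5 threshold and all names below are assumptions for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def pathway_association(ncrna_expr, pathway_gene_exprs, threshold=0.5):
    """Hypothetical rule: label the ncRNA-pathway pair by the mean correlation
    between the ncRNA and each gene in the pathway (threshold is an assumption)."""
    mean_r = sum(pearson(ncrna_expr, g) for g in pathway_gene_exprs) / len(pathway_gene_exprs)
    if mean_r >= threshold:
        return "positive"  # would be drawn as a red box
    if mean_r <= -threshold:
        return "negative"  # would be drawn as a blue box
    return "none"


if __name__ == "__main__":
    ncrna = [1.0, 2.0, 3.0, 4.0]  # made-up expression values across four samples
    print(pathway_association(ncrna, [[2, 4, 6, 8], [1, 3, 5, 8]]))
    print(pathway_association(ncrna, [[8, 6, 4, 2], [4, 3, 2, 1]]))
```

Collecting one such label or score per ncRNA and functional term would give a matrix of the kind that panel b clusters.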
Figure 3. Modular principles of large ncRNA genes
a, The four principles of nucleic acid and protein interactions: (1) RNA–protein interactions, (2) DNA–RNA hybridization-based interactions, (3) DNA–protein interactions and (4) RNA–RNA hybridization-based interactions. b, Each of these principles can be combined to build distinct complexes. For example, combining RNA–protein and RNA–DNA interactions can localize a protein complex to a specific DNA sequence in an RNA-dependent manner, as has been implicated for the DHFR promoter and localization of DNMT3b. Combining RNA–protein and protein–DNA principles can also localize a diverse set of proteins, which have a molecular scaffold created by RNA, to a specific DNA sequence in a protein-dependent manner. The ribosome is a multifaceted combination of RNA–protein interactions that facilitate correct RNA–RNA interactions for the ribozyme activity of the ribosome. The telomere replication activity of telomerase is an example of combining RNA–protein, RNA–DNA and protein–DNA interactions.
