Evidence for widespread association of mammalian splicing and conserved long-range RNA structures

RNA. 2012 Jan;18(1):1-15. doi: 10.1261/rna.029249.111. Epub 2011 Nov 29.

Abstract

Pre-mRNA structure impacts many cellular processes, including splicing in genes associated with disease. The contemporary paradigm of RNA structure prediction is biased toward secondary structures that occur within short ranges of pre-mRNA, although long-range base-pairings are known to be at least as important. Recently, we developed an efficient method for detecting conserved RNA structures on a genome-wide scale, one that does not require multiple sequence alignments and works equally well for the detection of local and long-range base-pairings. Using an enhanced method that detects base-pairings at all possible combinations of splice sites within each gene, we now report RNA structures that could be involved in the regulation of splicing in mammals. Statistically, we demonstrate a strong association between the occurrence of conserved RNA structures and alternative splicing, where local RNA structures are generally more frequent at alternative donor splice sites, while long-range structures are more strongly associated with weak alternative acceptor splice sites. As an example, we validated the RNA structure in the human SF1 gene using minigenes in the HEK293 cell line. Point mutations that disrupted the base-pairing of two complementary boxes between exons 9 and 10 of this gene altered the splicing pattern, while the compensatory mutations that reestablished the base-pairing restored the wild-type splicing pattern. There is statistical evidence for a Dscam-like class of mammalian genes, in which mutually exclusive RNA structures control mutually exclusive alternative splicing. In sum, we propose that long-range base-pairings carry an important, yet unconsidered, part of the splicing code, and that, even by modest estimates, there must be thousands of such potentially regulatory structures conserved throughout the evolutionary history of mammals.
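
The logic of the disrupting and compensatory mutations described above can be illustrated with a minimal Python sketch. It is not the authors' detection method and does not use the SF1 sequence: the two boxes, the helper names `paired_positions` and `mutate`, and the choice to count G-U wobble pairs are all assumptions made for illustration only.

```python
# Allowed RNA base pairs: Watson-Crick plus the G-U wobble pair (an assumption
# of this sketch; the abstract does not state which pairs were counted).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(box5: str, box3: str) -> int:
    """Count base pairs when box5 (5'->3') is aligned antiparallel to box3."""
    if len(box5) != len(box3):
        raise ValueError("boxes must have equal length")
    # Reading box3 backwards puts its 3' end opposite the 5' end of box5.
    return sum((a, b) in PAIRS for a, b in zip(box5, reversed(box3)))


def mutate(seq: str, pos: int, base: str) -> str:
    """Return seq with the nucleotide at 0-based position pos replaced by base."""
    return seq[:pos] + base + seq[pos + 1:]


# Hypothetical complementary boxes (NOT taken from the SF1 gene).
wild_5 = "GGCAUCC"
wild_3 = "GGAUGCC"

# A point mutation in the upstream box breaks one base pair.
mut_5 = mutate(wild_5, 2, "G")

# The compensatory mutation changes the partner nucleotide in the downstream
# box; position 2 of box5 faces position len - 1 - 2 = 4 of box3.
comp_3 = mutate(wild_3, len(wild_3) - 1 - 2, "C")

wild_pairs = paired_positions(wild_5, wild_3)        # fully paired
single_pairs = paired_positions(mut_5, wild_3)       # one pair disrupted
double_pairs = paired_positions(mut_5, comp_3)       # pairing reestablished

print(wild_pairs, single_pairs, double_pairs)
```

In this toy case the single mutant loses one base pair and the double mutant regains full pairing, which mirrors the experimental design of the minigene assay: if splicing follows the pairing state rather than the primary sequence, the double mutant should behave like the wild type.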

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing*
  • Animals
  • Base Sequence
  • Conserved Sequence
  • Doublecortin-Like Kinases
  • HEK293 Cells
  • Humans
  • Intracellular Signaling Peptides and Proteins / genetics
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Protein Serine-Threonine Kinases / genetics
  • RNA Precursors / chemistry*
  • RNA Precursors / genetics*
  • RNA Splice Sites
  • RNA Splicing*
  • Sequence Analysis, RNA

Substances

  • Intracellular Signaling Peptides and Proteins
  • RNA Precursors
  • RNA Splice Sites
  • DCLK1 protein, human
  • Doublecortin-Like Kinases
  • Protein Serine-Threonine Kinases