Inference of splicing regulatory activities by sequence neighborhood analysis

PLoS Genet. 2006 Nov 24;2(11):e191. doi: 10.1371/journal.pgen.0020191. Epub 2006 Sep 28.

Abstract

Sequence-specific recognition of nucleic-acid motifs is critical to many cellular processes. We have developed a new and general method called Neighborhood Inference (NI) that predicts sequences with activity in regulating a biochemical process based on the local density of known sites in sequence space. Applied to the problem of RNA splicing regulation, NI was used to predict hundreds of new exonic splicing enhancer (ESE) and silencer (ESS) hexanucleotides from known human ESEs and ESSs. These predictions were supported by cross-validation analysis, by analysis of published splicing regulatory activity data, by sequence-conservation analysis, and by measurement of the splicing regulatory activity of 24 novel predicted ESEs, ESSs, and neutral sequences using an in vivo splicing reporter assay. These results demonstrate the ability of NI to accurately predict splicing regulatory activity and show that the scope of exonic splicing regulatory elements is substantially larger than previously anticipated. Analysis of orthologous exons in four mammals showed that the NI score of ESEs, a measure of function, is much more highly conserved above background than ESE primary sequence. This observation indicates a high degree of selection for ESE activity in mammalian exons, with surprisingly frequent interchangeability between ESE sequences.
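
The idea of scoring a sequence by the local density of known sites in sequence space can be illustrated with a toy sketch. This is not the NI scoring function from the paper, which the abstract does not specify. It assumes, for illustration only, that the neighborhood of a hexanucleotide is the set of 18 sequences at Hamming distance 1, and that the score is the fraction of neighbors that are known enhancers minus the fraction that are known silencers. The function names (`hamming_neighbors`, `neighborhood_score`) and the small example sets are hypothetical placeholders, not data from the study.

```python
ALPHABET = "ACGT"

def hamming_neighbors(kmer):
    """Return every sequence that differs from kmer at exactly one position."""
    neighbors = []
    for i, base in enumerate(kmer):
        for alt in ALPHABET:
            if alt != base:
                neighbors.append(kmer[:i] + alt + kmer[i + 1:])
    return neighbors

def neighborhood_score(kmer, enhancers, silencers):
    """Toy score (an assumption, not the published NI formula):
    (enhancer neighbors - silencer neighbors) / total neighbors.
    Positive values suggest an enhancer-rich neighborhood,
    negative values a silencer-rich one, and 0 a neutral one."""
    neighbors = hamming_neighbors(kmer)
    n_enh = sum(n in enhancers for n in neighbors)
    n_sil = sum(n in silencers for n in neighbors)
    return (n_enh - n_sil) / len(neighbors)

# Placeholder sets for illustration only; not the ESE/ESS sets used in the paper.
KNOWN_ENHANCERS = {"GAAGAA", "GAAGAT", "GAAGAC"}
KNOWN_SILENCERS = {"TTAGGG", "TTAGGT"}

for query in ("GAAGAG", "TTAGGA", "CCCCCC"):
    print(query, round(neighborhood_score(query, KNOWN_ENHANCERS, KNOWN_SILENCERS), 3))
```

In this sketch a query hexamer that is itself absent from both known sets can still receive a nonzero score from its neighbors, which is the sense in which new candidate elements are inferred from known ones.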

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Binding Sites
  • Cluster Analysis
  • Conserved Sequence
  • Enhancer Elements, Genetic / genetics
  • Exons / genetics
  • HeLa Cells
  • Humans
  • RNA Splice Sites / genetics*
  • RNA Splicing / genetics*
  • Reproducibility of Results
  • Sequence Analysis, DNA

Substances

  • RNA Splice Sites