BMC Genomics. 2016 Oct 7;17(1):787.
doi: 10.1186/s12864-016-3121-4.

Does conservation account for splicing patterns?

Free PMC article

Michael Wainberg et al. BMC Genomics.

Abstract

Background: Alternative mRNA splicing is critical to proteomic diversity and tissue and species differentiation. Exclusion of cassette exons, also called exon skipping, is the most common type of alternative splicing in mammals.

Results: We present a computational model that predicts absolute (though not tissue-differential) percent-spliced-in of cassette exons more accurately than previous models, despite not using any 'hand-crafted' biological features such as motif counts. We achieve nearly identical performance using only the conservation score (mammalian phastCons) of each splice junction normalized by average conservation over 100 bp of the corresponding flanking intron, demonstrating that conservation is an unexpectedly powerful indicator of alternative splicing patterns. Using this method, we provide evidence that intronic splicing regulation occurs predominantly within 100 bp of the alternative splice sites and that conserved elements in this region are, as expected, functioning as splicing regulators. We show that among conserved cassette exons, increased conservation of flanking introns is associated with reduced inclusion. We also propose a new definition of intronic splicing regulatory elements (ISREs) that is independent of conservation, and show that most ISREs do not match known binding sites or splicing factors despite being predictive of percent-spliced-in.
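The two-feature representation described in the Results (junction conservation divided by average conservation over 100 bp of the flanking intron) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the `eps` guard against division by zero, and the assumption that the junction is summarized by a single phastCons value are all hypothetical.

```python
from statistics import mean


def normalized_junction_conservation(junction_score, intron_scores,
                                     window=100, eps=1e-6):
    """Junction conservation divided by mean flanking-intron conservation.

    junction_score: phastCons score at the splice junction (assumed to be
                    a single value in [0, 1]).
    intron_scores:  per-base phastCons scores of the flanking intron,
                    ordered outward from the splice site.
    window:         number of intronic bases to average (100 bp in the text).
    eps:            hypothetical small constant to avoid division by zero.
    """
    if not intron_scores:
        raise ValueError("intron_scores must contain at least one value")
    flank = intron_scores[:window]
    return junction_score / (mean(flank) + eps)


# One feature per flanking intron gives the "2 features" input.
upstream = normalized_junction_conservation(0.9, [0.3] * 150)
downstream = normalized_junction_conservation(0.8, [0.8] * 150)
features = [upstream, downstream]
```

Under this sketch, a well-conserved junction next to a poorly conserved intron yields a ratio well above 1, while a junction embedded in a uniformly conserved region yields a ratio near 1.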

Conclusions: These findings suggest that one mechanism for the evolutionary transition from constitutive to alternative splicing is the emergence of cis-acting splicing inhibitors. The association of our ISREs with differences in splicing suggests the existence of novel RNA-binding proteins and/or novel splicing roles for known RNA-binding proteins.

Keywords: Alternative splicing; Conservation; Splicing regulation.


Figures

Fig. 1
AUC of various alternative splicing models. (a) A convolutional DNN trained on sequences up to 384 bp from each of the four splice sites involved in cassette splicing (8 × 384 bp). (b) Same as (a), weighting the post-convolutional feature map by conservation (8 × 384 bp). (c) Same as (b), using only the 100 bp at each end of the cassette exon (2 × 100 bp). (d) Same as (c), using only the first 100 bp of each flanking intron (2 × 100 bp). (e) State of the art: the method of [11] (1393 features). (f) A DNN trained on only the conservation of the regions in (d) (2 × 100 features). (g) A DNN trained on junction conservation divided by average conservation over 100 bp (2 features). (h) A DNN trained on junction conservation (2 features). (i) A DNN trained on average conservation over 100 bp (2 features). (j) A DNN trained on the combined features of (e) and (g) (1393 + 2 features)
Fig. 2
Most intronic splicing regulation occurs within 100 bp of the splice site. Correlation between junction/average conservation and tissue-averaged Ψ as the averaging window is increased from 1 to 384 bp of the flanking introns nearest the splice site. The correlation peaks at 132 bp for the upstream splice site and 92 bp for the downstream splice site
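The window sweep behind Fig. 2 might look like the sketch below. It is an illustration under stated assumptions: the caption does not say which correlation coefficient was used, so Pearson correlation is assumed here, and the event layout `(junction_score, intron_scores, psi)`, the `eps` guard, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def window_sweep(events, max_window=384, eps=1e-6):
    """Correlate junction/average conservation with Psi for each window size.

    events: list of (junction_score, intron_scores, psi) tuples, where
            intron_scores is non-empty and ordered outward from the
            splice site (hypothetical layout).
    Returns a dict mapping window size (bp) to the correlation.
    """
    psis = [psi for _, _, psi in events]
    results = {}
    for w in range(1, max_window + 1):
        ratios = []
        for junction, scores, _ in events:
            flank = scores[:w]
            ratios.append(junction / (sum(flank) / len(flank) + eps))
        results[w] = pearson(ratios, psis)
    return results


def best_window(results):
    """Window with the largest absolute correlation (an assumed criterion)."""
    return max(results, key=lambda w: abs(results[w]))


# Toy data: more conserved introns paired with lower Psi.
toy_events = [
    (0.9, [0.1] * 10, 0.9),
    (0.9, [0.5] * 10, 0.6),
    (0.9, [0.9] * 10, 0.2),
]
toy_results = window_sweep(toy_events, max_window=10)
```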
Fig. 3
Junction versus average conservation. (a) Upstream junction versus average conservation for all high-confidence (σ(Ψ) < 0.1) high Ψ (red) and low Ψ (blue) events (downstream results are similar). 99.8% of all events fall into one of 3 regimes: high (> 0.5) junction and low average conservation (Old-), high junction and average conservation (Old+), and low junction and average conservation (New). Small Gaussian noise was applied in the y direction to avoid superimposing all tissues for each exon. Pie charts of Ψ for each regime and the whole dataset, also including medium Ψ events (green), are superimposed. (b) The same data broken down first by Ψ range and then by conservation regime
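The three conservation regimes named in the Fig. 3 caption can be expressed as a small classifier. This is a sketch only: the caption gives 0.5 as the cutoff for "high" junction conservation, and applying the same cutoff to average conservation is an assumption made here. The function name and the "other" label for the remaining events are hypothetical.

```python
def conservation_regime(junction, average, threshold=0.5):
    """Assign an event to Old-, Old+, New, or 'other'.

    junction:  junction conservation score.
    average:   average conservation of the flanking intron.
    threshold: cutoff for 'high' conservation (0.5 for the junction in the
               caption; reuse for the average is an assumption).
    """
    high_junction = junction > threshold
    high_average = average > threshold
    if high_junction and not high_average:
        return "Old-"
    if high_junction and high_average:
        return "Old+"
    if not high_junction and not high_average:
        return "New"
    # Low junction with high average conservation: outside the three regimes.
    return "other"


regimes = [conservation_regime(j, a)
           for j, a in [(0.9, 0.1), (0.9, 0.8), (0.1, 0.05), (0.2, 0.7)]]
```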
Fig. 4
6-mer count versus conservation enrichment. Total count of each 6-mer versus conservation enrichment in upstream flanking introns (downstream results are similar). The high variation among k-mer counts (σ = 182) indicates substantial selection pressure on flanking introns. More common 6-mers tend to be more conserved (Spearman correlation 0.411, p < 1e-166). The most severely under-conserved k-mers (conservation enrichment less than one-third the background) are also extremely rare, appearing in a tight band along the lower left edge of the plot
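The two quantities plotted in Fig. 4 could be tabulated as in the sketch below. The caption does not define conservation enrichment, so this example assumes it is the mean conservation over all occurrences of a k-mer divided by the mean conservation of all bases scanned (the background). That definition, the function name, and the input layout are assumptions for illustration.

```python
from collections import Counter, defaultdict


def kmer_conservation_enrichment(introns, k=6):
    """Count overlapping k-mers and compute an assumed enrichment score.

    introns: list of (sequence, scores) pairs, where scores holds one
             conservation value per base of sequence.
    Returns (counts, enrichment): total occurrences per k-mer, and mean
    k-mer conservation divided by background mean conservation.
    """
    counts = Counter()
    conservation_sum = defaultdict(float)
    total_conservation = 0.0
    total_bases = 0
    for seq, scores in introns:
        if len(seq) != len(scores):
            raise ValueError("sequence and scores must have equal length")
        total_conservation += sum(scores)
        total_bases += len(scores)
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            counts[kmer] += 1
            conservation_sum[kmer] += sum(scores[i:i + k]) / k
    if total_bases == 0 or total_conservation == 0:
        return counts, {}
    background = total_conservation / total_bases
    enrichment = {
        kmer: (conservation_sum[kmer] / counts[kmer]) / background
        for kmer in counts
    }
    return counts, enrichment


# Toy example with k=4 to keep the sequence short.
toy_counts, toy_enrichment = kmer_conservation_enrichment(
    [("ACGTACGT", [1.0] * 4 + [0.0] * 4)], k=4
)
```

With real data, the resulting counts and enrichment values could then be rank-correlated (for example with a Spearman implementation of your choice) to reproduce the kind of comparison the caption reports.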
Fig. 5
A ‘meta-correlation’ plot for 6-mers. The x coordinate of each 6-mer is the correlation across events where the k-mer appears of its count in the upstream 15–100 bp region (downstream results are similar) with tissue-averaged Ψ (ISE/ISS character), and the y coordinate gives the correlation of its conservation enrichment with Ψ (conservation bias). These two properties have a Spearman correlation of 0.0588 (p < 0.003) across all 6-mers. 6-mers appearing in fewer than 100 events or never appearing more than once in any event are not shown

References

    Nilsen TW, Graveley BR. Expansion of the eukaryotic proteome by alternative splicing. Nature. 2010;463(7280):457–63. doi: 10.1038/nature08909.
    Barash Y, Calarco JA, Gao W, Pan Q, Wang X, Shai O, Blencowe BJ, Frey BJ. Deciphering the splicing code. Nature. 2010;465(7294):53–9. doi: 10.1038/nature09000.
    Licatalosi DD, Darnell RB. RNA processing and its regulation: global insights into biological networks. Nat Rev Genet. 2010;11(1):75–87. doi: 10.1038/nrg2673.
    Xiong HY, Alipanahi B, Lee LJ, Bretschneider H, Merico D, Yuen RK, Hua Y, Gueroussov S, Najafabadi HS, Hughes TR, et al. The human splicing code reveals new insights into the genetic determinants of disease. Science. 2015;347(6218):1254806. doi: 10.1126/science.1254806.
    Ray D, Kazan H, Chan ET, Castillo LP, Chaudhry S, Talukder S, Blencowe BJ, Morris Q, Hughes TR. Rapid and systematic analysis of the RNA recognition specificities of RNA-binding proteins. Nat Biotechnol. 2009;27(7):667–70. doi: 10.1038/nbt.1550.
