PLoS One. 2010 Aug 10;5(8):e12047.
doi: 10.1371/journal.pone.0012047.

SAW: A Method to Identify Splicing Events From RNA-Seq Data Based on Splicing Fingerprints

Free PMC article

Kang Ning et al. PLoS One.


Splicing event identification is one of the most important issues in the comprehensive analysis of transcription profiles. The recent development of next-generation sequencing technology has generated extensive profiles of alternative splicing. However, while many of these splicing events are between exons that are relatively close on genome sequences, reads generated by RNA-Seq are not limited to alternative splicing between close exons but occur in virtually all splicing events. In this work, a novel method, SAW, is proposed for the identification of all splicing events based on short reads from RNA-Seq. It was observed that short reads not in known gene models are actually absent words from known gene sequences. An efficient method was developed to filter and cluster these short reads by fingerprint fragments of splicing events, without aligning the short reads to genome sequences. The possible splicing sites were likewise determined without alignment against genome sequences. A consensus sequence was then generated for each short read cluster and aligned to the genome sequences. Results demonstrated that this method could identify more than 90% of the known splicing events with a very low false discovery rate, and could accurately identify a number of novel splicing events between distant exons.
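The filtering and clustering idea in the abstract can be illustrated with a minimal Python sketch. This is not the SAW implementation: it swaps in fixed-length k-mers for the paper's absent words, uses the first absent k-mer of a read as its fingerprint, and runs on made-up toy sequences. The function names (`kmers`, `known_words`, `cluster_by_absent_word`) and the choice `k = 4` are assumptions made for illustration only.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring (word) of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def known_words(gene_seqs, k):
    """Collect all length-k words that occur in the known gene sequences."""
    words = set()
    for seq in gene_seqs:
        words.update(kmers(seq, k))
    return words


def cluster_by_absent_word(reads, known, k):
    """Group reads by the first length-k word that is absent from `known`.

    Reads made up only of known words are filtered out, since they are
    already explained by the known gene sequences. No genome alignment
    is performed at this stage.
    """
    clusters = defaultdict(list)
    for read in reads:
        for word in kmers(read, k):
            if word not in known:
                clusters[word].append(read)
                break  # the first absent word serves as the fingerprint
    return dict(clusters)


# Toy data (hypothetical): two known sequences and four short reads.
# Three reads span a junction that joins the end of the first sequence
# to the start of the second; one read lies entirely inside a known one.
genes = ["AAAACCCC", "GGGGTTTT"]
reads = ["AACCCC", "ACCCCGGG", "CCCCGGGG", "CCCGGGGT"]
clusters = cluster_by_absent_word(reads, known_words(genes, 4), 4)
print(clusters)
```

In this toy case the read `AACCCC` is discarded because every one of its 4-mers is known, while the three junction-spanning reads share the absent word `CCCG` and fall into a single cluster.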

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.


Figure 1. Clustering short reads and determination of candidate splice site by MAWs.
(a) Extraction of MAWs from gene sequences database. (b) Mapping MAW onto short read. (c) Clustering of short reads based on MAW to form consensus sequences. (d) Aligning consensus sequence to exon boundaries according to candidate splice site. Only consensus sequence (in color) is used to align against exon boundaries. Annotation: shadowed area indicates consensus sequence.
Figure 2. Different types of MAWs, the corresponding short reads and the splicing events that these short reads could identify.
Note that these types are nested: MAWs from known exons include those from known gene models, and the same holds for the short reads and the splicing events. Short reads that are not present in known gene models are likely to correspond to splicing events between distant exons.
Figure 3. General scheme for splicing event identification by SAW.
Note that filtration against known gene models is only necessary for identification of novel splicing events.
Figure 4. The increase of the proportion of splicing events identified by SAW with increasing number of short reads per splicing event.
Results are based on short read clusters with minimum sizes of 5 and 10.
Figure 5. The BLAT E-value distribution of 48-mers spanning known junctions, random junctions and junctions identified by SAW.
Figure 6. Example of a novel splicing event identified by SAW from multiple reads in gene Ttll7s.
The splicing events (annotated by black reads) are not identified based on UCSC mm9 gene models.
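Panel (c) of Figure 1 describes forming a consensus sequence from each cluster of short reads. One simple way to picture this step is sketched below in Python: every read in a cluster is anchored at the position of its shared fingerprint word, and each column is resolved by majority vote. This is an assumed simplification rather than the authors' procedure; the function name `consensus_from_cluster` and the toy reads are hypothetical.

```python
from collections import Counter


def consensus_from_cluster(reads, word):
    """Build a consensus by anchoring reads on a shared fingerprint word.

    Each read is shifted so that its first occurrence of `word` lines up
    with the others; the most common base in each column is kept.
    Reads that do not contain `word` are ignored.
    """
    placed = [(read.find(word), read) for read in reads]
    placed = [(pos, read) for pos, read in placed if pos >= 0]
    if not placed:
        return ""

    # The read with the longest left flank defines column 0.
    left = max(pos for pos, _ in placed)

    columns = {}
    for pos, read in placed:
        shift = left - pos
        for i, base in enumerate(read):
            columns.setdefault(shift + i, Counter())[base] += 1

    # Every read covers the anchor columns, so the columns are contiguous.
    return "".join(columns[i].most_common(1)[0][0] for i in sorted(columns))


# Toy cluster (hypothetical); the last read carries one sequencing error.
cluster = ["ACCCCGGG", "CCCCGGGG", "CCCGGGGT", "CCCCGGAG"]
consensus = consensus_from_cluster(cluster, "CCCG")
print(consensus)
```

Only the resulting consensus, rather than every individual read, would then need to be aligned against exon boundaries, which is the point of panel (d).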
