The dystrophin protein encoding DMD gene is the longest human gene. The 2.2 Mb long human dystrophin transcript takes 16 hours to be transcribed and is co-transcriptionally spliced. It contains long introns (24 over 10kb long, 5 over 100kb long) and the heterogeneity in intron size makes it an ideal transcript to study different aspects of the human splicing process. Splicing is a complex process and much is unknown regarding the splicing of long introns in human genes. Here, we used ultra-deep transcript sequencing to characterize splicing of the dystrophin transcripts in 3 different human skeletal muscle cell lines, and explored the order of intron removal and multi-step splicing. Coverage and read pair analyses showed that around 40% of the introns were not always removed sequentially. Additionally, for the first time, we report that non-consecutive intron removal resulted in 3 or more joined exons which are flanked by unspliced introns and we defined these joined exons as an exon block. Lastly, computational and experimental data revealed that, for the majority of dystrophin introns, multistep splicing events are used to splice out a single intron. Overall, our data show for the first time in a human transcript, that multi-step intron removal is a general feature of mRNA splicing.
Keywords: Capture-pre-mRNA-seq; Duchenne muscular dystrophy; exon blocks; intron removal; nested splicing; next generation sequencing; order of splicing; recursive splicing; splicing.