Genes Dev. 2018 Apr 1;32(7-8):577-591.
doi: 10.1101/gad.312058.118. Epub 2018 Apr 17.

Most human introns are recognized via multiple and tissue-specific branchpoints

Free PMC article

Jose Mario Bello Pineda et al. Genes Dev.

Abstract

Although branchpoint recognition is an essential component of intron excision during the RNA splicing process, the branchpoint itself is frequently assumed to be a basal, rather than regulatory, sequence feature. However, this assumption has not been systematically tested due to the technical difficulty of identifying branchpoints and quantifying their usage. Here, we analyzed ∼1.31 trillion reads from 17,164 RNA sequencing data sets to demonstrate that almost all human introns contain multiple branchpoints. This complexity holds even for constitutive introns, 95% of which contain multiple branchpoints, with an estimated five to six branchpoints per intron. Introns upstream of the highly regulated ultraconserved poison exons of SR genes contain twice as many branchpoints as the genomic average. Approximately three-quarters of constitutive introns exhibit tissue-specific branchpoint usage. In an extreme example, we observed a complete switch in branchpoint usage in the well-studied first intron of HBB (β-globin) in normal bone marrow versus metastatic prostate cancer samples. Our results indicate that the recognition of most introns is unexpectedly complex and tissue-specific and suggest that alternative splicing catalysis typifies the majority of introns even in the absence of differences in the mature mRNA.

Keywords: RNA; alternative splicing; branchpoint.

Figures

Figure 1.
Genome-wide branchpoint annotation from RNA-seq data. (A) Overview of our branchpoint detection algorithm (see also Supplemental Figure S1). (B) Branchpoint annotation of SRSF5. For simplicity, only the intron-distal splice site of a competing 5′ splice site event within the first intron is illustrated in the exon–intron structure. (Vertical red bars) Branchpoints; (horizontal black lines) 5′ splice site–branchpoint pairs. The plot is based on an image from the University of California at Santa Cruz (UCSC) Genome Browser (Meyer et al. 2013). (C) Branchpoint detection rate as a function of the number of sequenced lariats. We randomly sampled from all sequenced lariats analyzed in our study and computed the number of distinct 5′ splice site–branchpoint pairs detected. As 5′ splice site–branchpoint pairs were not reported by other studies, we illustrated the number of reported branchpoints instead. For Taggart et al. (2017), we illustrated their “high-confidence” set of branchpoints. (D) Fraction of all RefSeq constitutive introns with one or more mapped branchpoints. (E) Distribution of mapped branchpoints among different annotation classes. (RefSeq const.) RefSeq constitutive introns; (RefSeq non-const.) RefSeq nonconstitutive introns; [annotated (non-RefSeq)] introns present in the UCSC, Ensembl, or Mixture of Isoforms (MISO) annotation databases but not RefSeq; (unannotated) introns formed by unannotated ligation of annotated 5′ and 3′ splice sites.
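The random subsampling described for panel C can be sketched as a rarefaction curve. This is a minimal illustration, not the authors' pipeline: the function name `rarefaction_curve`, the representation of each sequenced lariat as a (5′ splice site, branchpoint) tuple, and the toy coordinates are all assumptions made for the example.

```python
import random

def rarefaction_curve(lariats, sample_sizes, seed=0):
    """Count distinct (5' splice site, branchpoint) pairs in random subsamples.

    lariats: list of (five_prime_ss, branchpoint) tuples, one per sequenced
             lariat (hypothetical representation, not the paper's format).
    sample_sizes: subsample sizes, each no larger than len(lariats).
    Returns a list of (sample_size, n_distinct_pairs) tuples.
    """
    rng = random.Random(seed)
    curve = []
    for n in sample_sizes:
        subsample = rng.sample(lariats, n)  # draw n lariats without replacement
        curve.append((n, len(set(subsample))))
    return curve

# Toy data: six sequenced lariats that cover three distinct pairs.
toy_lariats = [
    ("ss1", "bp_a"), ("ss1", "bp_a"), ("ss1", "bp_b"),
    ("ss2", "bp_c"), ("ss2", "bp_c"), ("ss2", "bp_c"),
]
toy_curve = rarefaction_curve(toy_lariats, [1, 3, 6])
```

Plotting the second element of each tuple against the first gives a detection-rate curve of the kind the caption describes; a curve that is still rising at the full sample size suggests that more pairs remain undetected.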
Figure 2.
Branchpoint position, but not sequence context, is constrained. (A) Sequence logos of branchpoint contexts. The plot is restricted to branchpoints within RefSeq constitutive introns. (B) Histogram of branchpoint positions relative to the 3′ splice site, where position −1 nt corresponds to the last intronic nucleotide. Vertical dashed lines at −20, −28, and −49 nt illustrate the 10th, 50th, and 90th percentiles of positions for U2-type introns. The plot is restricted to branchpoints within RefSeq constitutive introns. (C) As in B but for U2-type introns classified as constitutive or retained. To ensure that the analyzed sets of introns were disjoint, we restricted to constitutive introns that did not overlap introns annotated as potentially retained in the MISO version 2.0 annotation even if those introns did not exhibit retention in our data. The vertical dashed line at −28 nt illustrates the median position for constitutive introns. (D) As in B but for U2-type introns classified as constitutive or upstream of a cassette exon. To ensure that the analyzed sets of introns were disjoint, we restricted to constitutive introns that did not overlap introns associated with cassette exons even if those cassette exons did not exhibit alternative splicing in our data. The vertical dashed line at −28 nt illustrates the median position for constitutive introns.
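The coordinate convention in panel B (position −1 nt is the last intronic nucleotide) and the percentile summary can be illustrated as follows. This is a sketch under stated assumptions: the function names, the 1-based genomic coordinates, and the toy positions are hypothetical, and ranking branchpoints by distance from the 3′ splice site is inferred from the caption's ordering (−20 nt as the 10th percentile, −49 nt as the 90th), not taken from the authors' code.

```python
import statistics

def relative_position(branchpoint, last_intronic_nt, strand):
    """Branchpoint position relative to the 3' splice site.

    Coordinates are 1-based genomic positions (assumed). last_intronic_nt is
    the genomic coordinate of the final intronic nucleotide in transcript
    orientation, so a branchpoint at that coordinate maps to -1.
    """
    if strand == "+":
        return branchpoint - last_intronic_nt - 1
    if strand == "-":
        return last_intronic_nt - branchpoint - 1
    raise ValueError("strand must be '+' or '-'")

def position_percentiles(positions):
    """10th, 50th, and 90th percentiles, ranked by distance from the 3' splice site."""
    distances = [-p for p in positions]  # e.g. -28 nt becomes a distance of 28
    deciles = statistics.quantiles(distances, n=10, method="inclusive")
    return -deciles[0], -deciles[4], -deciles[8]

# Toy relative positions (invented for illustration only).
toy_positions = [-18, -20, -24, -28, -28, -30, -35, -49, -52]
p10, p50, p90 = position_percentiles(toy_positions)
```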
Figure 3.
Most constitutively spliced introns contain multiple branchpoints. (A,B) Branchpoint annotations of introns within POLR3A (A) and MBNL1, SNX9, and VASP (B) based on RNA-seq analysis as well as direct lariat sequencing. Colors indicate the evidence supporting each branchpoint. Examples of sequenced lariats are shown for POLR3A. (C) The fraction of constitutive introns with multiple branchpoints as a function of the number of sequenced lariats with a mismatch at the branchpoint. Error bars indicate 95% confidence interval estimated with a proportion test. (D) As in C but illustrating the mean number of branchpoints per intron. Error bars indicate standard deviation of the mean, estimated by bootstrapping. (E) Branchpoint usage as a function of the relative branchpoint position. Branchpoint usage is defined as the number of sequenced lariats supporting a given 5′ splice site–branchpoint pair divided by the total number of sequenced lariats mapped to that 5′ splice site. Each point corresponds to a single branchpoint. The plot is restricted to constitutive introns with two or more branchpoints. The two most commonly used branchpoints per intron are illustrated. (F) As in E but illustrating estimated binding energy to the U2 snRNA sequence AUGAUGUG for each branchpoint context.
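The definition of branchpoint usage in panel E (lariats supporting a given 5′ splice site–branchpoint pair divided by all lariats mapped to that 5′ splice site) translates directly into a few lines of code. The sketch below is illustrative only; the function name, the tuple representation of lariats, and the toy labels are assumptions.

```python
from collections import Counter, defaultdict

def branchpoint_usage(lariats):
    """Compute usage for each (5' splice site, branchpoint) pair.

    lariats: list of (five_prime_ss, branchpoint) tuples, one per sequenced
             lariat (hypothetical representation).
    Returns {(five_prime_ss, branchpoint): usage}, where the usage values for
    each 5' splice site sum to 1.
    """
    pair_counts = Counter(lariats)
    totals = defaultdict(int)
    for (five_prime_ss, _), count in pair_counts.items():
        totals[five_prime_ss] += count  # all lariats mapped to this 5' splice site
    return {
        pair: count / totals[pair[0]]
        for pair, count in pair_counts.items()
    }

# Toy data: one 5' splice site with two branchpoints, used 3:1.
toy_usage = branchpoint_usage(
    [("ss1", "bp_a"), ("ss1", "bp_a"), ("ss1", "bp_a"), ("ss1", "bp_b")]
)
```

Because usage is normalized per 5′ splice site, it can be compared across introns with very different lariat counts, although estimates from few lariats will be noisy.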
Figure 4.
Regulated alternative splicing is associated with high branchpoint multiplicity. (A) Branchpoint annotation for SRSF3. Sequence conservation was estimated with the phastCons 100-vertebrate conservation track (Siepel et al. 2005). The plot is based on an image from the UCSC Genome Browser (Meyer et al. 2013). (B) Branchpoint annotation for the intron upstream of the SRSF3 poison exon, based on RNA-seq analysis as well as direct lariat sequencing. Colors indicate the evidence supporting each branchpoint. (C) The mean number of branchpoints detected in each of the illustrated classes of introns. Alternative splicing annotations were based on the MISO version 2.0 isoform database (Katz et al. 2010). The plot is restricted to introns with ≥25 sequenced lariats to help control for intron-specific variability in lariat sequencing depth. Error bars indicate standard deviation of the mean, estimated by bootstrapping. (D) As in C but illustrating the frequencies with which each branchpoint nucleotide occurs. (E) As in C but illustrating the mean estimated U2 snRNA-binding energy. Error bars indicate standard deviation of the mean, estimated by bootstrapping.
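The bootstrapped error bars in panels C and E can be sketched as follows. This is a generic nonparametric bootstrap, offered as an assumption about what "estimated by bootstrapping" means here; the function name, the number of resamples, and the toy counts are hypothetical and not taken from the paper.

```python
import random
import statistics

def bootstrap_sd_of_mean(values, n_resamples=1000, seed=0):
    """Estimate the standard deviation of the mean by bootstrapping.

    Resamples `values` with replacement n_resamples times, computes the mean
    of each resample, and returns the standard deviation of those means.
    """
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.stdev(means)

# Toy data: branchpoints detected per intron in one class (invented values).
toy_branchpoint_counts = [3, 5, 8, 2, 6, 7, 4]
toy_error = bootstrap_sd_of_mean(toy_branchpoint_counts)
```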
Figure 5.
Tissue-specific branchpoint usage is common. (A) Branchpoint annotation and estimated branchpoint usage for the first intron of HBB. (N) Number of sequenced lariats with a mismatch at the inferred branchpoint. Error bars indicate 95% confidence intervals estimated with the binomial proportion test. P-values were estimated with the binomial proportion test. Branchpoints at positions −32 nt, −37 nt (the canonical branchpoint annotated biochemically) (Ruskin et al. 1984), and −41 nt were annotated with moderate, rather than high, confidence due to the nonuniqueness of the HBB intronic sequence. (B,C) As in A but for the indicated introns of VASP (B) and SRSF3 (C). Data are from direct lariat sequencing. P-values were estimated with the multinomial proportion test. The plot is restricted to branchpoints exhibiting differential branchpoint usage across the indicated samples, defined as a tissue-specific difference in branchpoint usage of ≥10% with an associated P-value ≤0.01 (two-sided test for difference in proportion). The illustrated percentages do not add up to 100% because the plot is restricted to differentially used branchpoints. (D) Detection of tissue-specific branchpoint usage in VASP and SRSF3 relative to the empirical false discovery rate (FDR) for each intron. Empirical FDRs were estimated by identifying differential branchpoint usage between technical replicates. P-values were estimated by comparing the frequencies of differential branchpoint usage detected between tissues and between technical replicates (two-sided test for difference in proportion). (E) The fraction of constitutive introns exhibiting tissue-specific branchpoint usage within the GTEx data set. (Left panel) Introns binned by the total number of sequenced lariats across all 54 tissues sampled by the GTEx project. (Right panel) Introns binned by the mean number of sequenced lariats per tissue. Error bars indicate 95% confidence intervals estimated with a proportion test.
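The criterion for differential branchpoint usage (a difference in usage of ≥10% with P ≤ 0.01 from a two-sided test for difference in proportion) can be illustrated with a pooled two-proportion z-test. The caption does not say which specific test was used, so the z-test here is an assumption chosen for simplicity, as are the function names and the toy counts.

```python
import math

def two_proportion_p_value(k1, n1, k2, n2):
    """Two-sided p-value from a pooled two-proportion z-test (assumed test).

    k1, k2: lariats supporting the branchpoint in samples 1 and 2.
    n1, n2: total lariats mapped to the 5' splice site in each sample.
    """
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # both proportions are 0 or both are 1: no evidence of a difference
    z = (k1 / n1 - k2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal

def is_differential(k1, n1, k2, n2, min_delta=0.10, max_p=0.01):
    """Apply both thresholds from the caption: effect size and significance."""
    delta = abs(k1 / n1 - k2 / n2)
    return delta >= min_delta and two_proportion_p_value(k1, n1, k2, n2) <= max_p

# Toy counts: 90 of 100 lariats in one tissue versus 10 of 100 in another.
toy_call = is_differential(90, 100, 10, 100)
```

Requiring both thresholds guards against two failure modes: large but poorly supported differences in shallowly sequenced introns, and statistically significant but biologically small differences in deeply sequenced ones.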
