BMC Genomics. 2015 Dec 1;16:1021.
doi: 10.1186/s12864-015-2235-4.

Fusion Transcript Loci Share Many Genomic Features With Non-Fusion Loci

Free PMC article

John Lai et al. BMC Genomics.

Erratum in

Abstract

Background: Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells as previous studies have shown extensively that these cells readily undergo fusion transcription.

Results: We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from LNCaP prostate cancer cells that were mock-, androgen- (DHT), or anti-androgen- (bicalutamide, enzalutamide) treated. In total, 185 fusion transcripts were identified across all RNAseq datasets. The majority (76%) of these fusion transcripts were 'read-through chimeras' derived from adjacent genes in the genome. Sequences at fusion loci were characterized using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76%) use the consensus GT-AG intron donor-acceptor splice site, and that most fusion transcripts (85%) maintain the open reading frame. We assessed whether the parental genes of fusion transcripts have the potential to base pair with each other, which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity to hybridize as non-fusion loci. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated, given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at either fusion or non-fusion junctions. Finally, RT-qPCR was performed on RNA from clinical prostate tumors and adjacent non-cancer cells (N = 7), and from LNCaP cells treated as above, to validate the expression of seven fusion transcripts and their respective parental genes. We show that fusion transcript expression is similar to the expression of the parental genes.

Conclusions: Fusion transcripts maintain the open reading frame, and likely use the same transcriptional machinery as non-fusion transcripts as they share many genomic features at splice/fusion junctions.
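
The two junction-level checks described in the Results (donor-acceptor dinucleotide class and reading-frame maintenance) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used FusionMap and custom Perl scripts. The function names, the example intron sequences, and the frame convention (coding nucleotides contributed by the first gene, modulo 3, compared with the 0-based codon position at which the second gene resumes) are assumptions made for illustration.

```python
from collections import Counter


def splice_site_class(intron: str) -> str:
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'.

    The donor is taken as the first two nucleotides of the intron and the
    acceptor as the last two (sequence given 5'->3' on the coding strand).
    """
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron sequence must be at least 4 nt long")
    return f"{seq[:2]}-{seq[-2:]}"


def keeps_reading_frame(upstream_coding_nt: int, downstream_codon_pos: int) -> bool:
    """Hypothetical frame check for a fusion junction.

    upstream_coding_nt: coding nucleotides contributed by the first gene.
    downstream_codon_pos: 0-based codon position (0, 1, 2) of the first
    nucleotide contributed by the second gene, in that gene's own frame.
    The frame is kept when the second gene resumes at the codon position
    that the first gene left off at.
    """
    if downstream_codon_pos not in (0, 1, 2):
        raise ValueError("codon position must be 0, 1, or 2")
    return upstream_coding_nt % 3 == downstream_codon_pos


# Made-up intron sequences, for illustration only.
example_introns = [
    "GTAAGTCCTTTCAG",
    "GCAAGTTTTTACAG",
    "GTGAGTCTCTGCAG",
    "ATATCCTTTTCCAC",
]
class_counts = Counter(splice_site_class(s) for s in example_introns)
```

Tallying `class_counts` over all junctions in a dataset gives the kind of per-class summary reported in the abstract. A real analysis would also need to handle strand orientation and junctions that do not fall on annotated exon boundaries.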

Figures

Fig. 1
a Circos plot from RNAseq data of fusion transcripts from the Ren et al. dataset [29], from our clinical prostate cancers and adjacent non-cancer prostate cells (n = 8), and from LNCaP prostate cancer cells that were treated with either 10 nM androgen (DHT) or 10 μM anti-androgen (bicalutamide and enzalutamide). b Venn diagram detailing how many unique fusion transcripts were detected between the different RNAseq datasets. c Venn diagram detailing how many unique fusion transcripts were detected between androgen or anti-androgen treated LNCaP cells
Fig. 2
a Pie chart showing the proportion of fusion points that occur at the exon boundaries of one, both, or neither of the genes that comprise the fusion transcript. b Bubble plot of the number of fusion transcripts that use the AT-AC, CT-AC, CT-GC, GC-AG, and GT-AG donor-acceptor splice sites. Bubble size represents the average gene expression (larger = greater expression) for fusion transcripts within that donor-acceptor class. c Pie chart of the percentage of fusion transcripts that maintain the original reading frames of the genes that comprise the fusion transcripts (inner pie chart). The outer pie chart represents the nucleotide position (0, 1, 2 = 1st, 2nd, and 3rd nucleotide, respectively) within the codon of the first (number before arrow) and second (number after arrow) genes at the fusion points of those respective genes
Fig. 3
a Diagram showing the 100 nt of genomic sequence upstream (solid line under gene) and downstream (dotted line under gene) of the point of fusion at the two genes comprising the fusion transcript that were used for hybridisation analysis. b The line graph represents the number of fusion transcripts that have complementary nucleotides (y-axis) at the respective distance (x-axis) from the point of fusion (x-axis = 0) between the up- and downstream sequences from gene 1 and gene 2. The histogram represents the average number of complementary nucleotides between the up- and downstream sequences from gene 1 and gene 2. The MEME result (coloured ACGT nucleotides) represents motifs of complementary sequences between the up- and downstream sequences from gene 1 and gene 2. Up- and downstream sequences from random non-fusion intron splice sites were used for comparison
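
A simplified way to picture the per-position complementarity counted in Fig. 3b is sketched below in Python. It is an illustration only: the study used RNAfold, which models folding thermodynamics, whereas this sketch only tests Watson-Crick pairing position by position between two equal-length flanks aligned antiparallel. The function name and the toy sequences are assumptions, not taken from the paper.

```python
# Watson-Crick partners for DNA sequence (genomic flanks).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complementary_positions(flank_a: str, flank_b: str) -> list[int]:
    """Return 0-based positions in flank_a that can pair with flank_b.

    Both flanks are given 5'->3'. Because base pairing is antiparallel,
    position i of flank_a is compared with position (len - 1 - i) of flank_b.
    Characters other than A, C, G, T (e.g. N) never count as paired.
    """
    a = flank_a.upper()
    b = flank_b.upper()
    if len(a) != len(b):
        raise ValueError("flanks must have the same length")
    return [
        i
        for i, (x, y) in enumerate(zip(a, reversed(b)))
        if COMPLEMENT.get(x) == y
    ]


# Toy 6-nt flanks; the analysis in the figure used 100-nt flanks.
perfect = complementary_positions("ACGTAC", "GTACGT")  # reverse complements
none = complementary_positions("AAAA", "AAAA")
partial = complementary_positions("ACGT", "AAAA")
```

Summing the returned positions over all fusion junctions, and over random non-fusion splice sites, would give the two profiles that the figure compares.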
Fig. 4
a Diagram showing 100 nt of genomic sequence upstream (solid line under gene) and downstream (dotted line under gene) of the point of fusion at the two genes comprising the fusion transcript that were used to identify repetitive sequences. b Repeats from six families (DNA, LINE, Low complexity, LTR, Simple repeat, SINE) were detected at fusion (red regions) and non-fusion (random, blue regions) regions at both gene loci
Fig. 5
Diagram of other fusion transcripts expressed at the seven candidate fusion loci. Red UCSC Bed tracks indicate fusion transcripts discovered by Iyer et al. [19]. Parental genes from which fusion transcripts were derived in our study are represented as green tracks, and other genes located at that locus are represented as blue tracks. The fusion junctions discovered in this study are also shown

References

    1. Pennisi E. Genomics. ENCODE project writes eulogy for junk DNA. Science. 2012;337(6099):1159–1161. doi: 10.1126/science.337.6099.1159.
    2. Gingeras TR. Implications of chimaeric non-co-linear transcripts. Nature. 2009;461(7261):206–211. doi: 10.1038/nature08452.
    3. Akiva P, Toporik A, Edelheit S, Peretz Y, Diber A, Shemesh R, et al. Transcription-mediated gene fusion in the human genome. Genome Res. 2006;16(1):30–36. doi: 10.1101/gr.4137606.
    4. Li X, Zhao L, Jiang H, Wang W. Short homologous sequences are strongly associated with the generation of chimeric RNAs in eukaryotes. J Mol Evol. 2009;68(1):56–65. doi: 10.1007/s00239-008-9187-0.
    5. Frenkel-Morgenstern M, Lacroix V, Ezkurdia I, Levin Y, Gabashvili A, Prilusky J, et al. Chimeras taking shape: potential functions of proteins encoded by chimeric RNA transcripts. Genome Res. 2012;22(7):1231–1242. doi: 10.1101/gr.130062.111.
