Curr Protoc Bioinformatics. 2017 Sep 13;59:11.15.1-11.15.21.
doi: 10.1002/cpbi.33.

Data Analysis Pipeline for RNA-seq Experiments: From Differential Expression to Cryptic Splicing

Hari Krishna Yalamanchili et al. Curr Protoc Bioinformatics. 2017.

Abstract

RNA sequencing (RNA-seq) is a high-throughput technology that provides unique insights into the transcriptome. It has a wide variety of applications in quantifying genes/isoforms and in detecting non-coding RNA, alternative splicing, and splice junctions. Comprehending the entire transcriptome is essential for a thorough understanding of the cellular system. Several RNA-seq analysis pipelines have been proposed to date. However, no single analysis pipeline can capture the dynamics of the entire transcriptome. Here, we compile and present a robust and commonly used analytical pipeline covering the entire spectrum of transcriptome analysis, including quality checks, alignment of reads, differential gene/transcript expression analysis, discovery of cryptic splicing events, and visualization. Challenges, critical parameters, and possible downstream functional analysis pipelines associated with each step are highlighted and discussed. This unit provides a comprehensive understanding of a state-of-the-art RNA-seq analysis pipeline and a greater understanding of the transcriptome. © 2017 by John Wiley & Sons, Inc.

Keywords: RNA-seq; alternative splicing; cryptic splicing; differential gene expression; differential isoform usage.

Figures

Figure 1. Quality check metrics on raw sequence reads from FastQC.
Bar plot of quality score (Phred score) for each base in the reads (a). Line plot showing the distribution of each nucleotide (A, C, G, T) at each base position of the sequence reads (b).
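The Phred scores summarized in Figure 1a map to base-calling error probabilities through the standard relation Q = -10 log10(P). The following is a minimal sketch of that conversion, not part of the published protocol. It assumes Phred+33 ASCII encoding of FASTQ quality strings, which the abstract does not specify, and the function names are hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_fastq_quality(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    The offset of 33 (Phred+33) is an assumption; older data may use 64.
    """
    return [ord(ch) - offset for ch in quality_string]


def mean_error_prob(quality_string, offset=33):
    """Average per-base error probability across one read."""
    scores = decode_fastq_quality(quality_string, offset)
    return sum(phred_to_error_prob(q) for q in scores) / len(scores)


if __name__ == "__main__":
    # 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
    print(decode_fastq_quality("II55"))
    print(mean_error_prob("II55"))
```

Averaging error probabilities rather than raw scores is a design choice here: because the Phred scale is logarithmic, the mean of the scores tends to understate the error rate of a read that contains a few low-quality bases.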
Figure 2. Visual inspection of sample clustering.
PCA plot (a) and heatmap of the correlation coefficients between samples (b), based on gene expression profiles of the six samples.
Figure 3. Results of differential gene expression analysis.
MA plot (a) and expression heatmap of the DEGs (adjusted P < 0.01) (b).
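For readers unfamiliar with the MA plot in Figure 3a, the sketch below computes the two coordinates in their classic form: M, the log2 fold change between two conditions, and A, the average log2 expression. It is illustrative only. The pseudocount and the function name are assumptions, and specific differential expression tools may define the horizontal axis differently (for example, as the mean of normalized counts).

```python
import math


def ma_values(mean_a, mean_b, pseudocount=1.0):
    """Return (M, A) for one gene from its mean expression in two conditions.

    M = log2(b) - log2(a)          (log2 fold change)
    A = (log2(a) + log2(b)) / 2    (average log2 expression)
    The pseudocount avoids log2(0) and is an assumption of this sketch.
    """
    log_a = math.log2(mean_a + pseudocount)
    log_b = math.log2(mean_b + pseudocount)
    return log_b - log_a, 0.5 * (log_a + log_b)


if __name__ == "__main__":
    # Hypothetical mean counts for a single gene in two conditions.
    m, a = ma_values(3.0, 15.0)
    print(m, a)
```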
Figure 4. Representative plots from the sleuth analysis Shiny app.
(a) PCA, (b) heatmap, (c) scatterplot, and (d) density plots.
Figure 5. Pie chart showing differential isoform usage for gene Gfra1 between WT and MT samples.
The usage of isoform P10382 decreased from 80.7% in WT to 65.3% in MT samples, whereas the usage of P18895 increased in MT samples.
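The percentages in Figure 5 can be read as each isoform's share of the gene's total expression. The sketch below assumes that common definition (isoform abundance divided by the summed abundance of all isoforms of the gene); the abstract does not state how usage was actually computed. The isoform names and abundance values are hypothetical placeholders, not data from the study.

```python
def isoform_usage(abundances):
    """Return each isoform's fraction of the gene's total abundance.

    `abundances` maps isoform name -> abundance (e.g. TPM) for one gene.
    """
    total = sum(abundances.values())
    if total == 0:
        # Gene not expressed: report zero usage rather than dividing by zero.
        return {name: 0.0 for name in abundances}
    return {name: value / total for name, value in abundances.items()}


def usage_change(condition_a, condition_b):
    """Difference in usage (b minus a) for every isoform seen in either condition."""
    usage_a = isoform_usage(condition_a)
    usage_b = isoform_usage(condition_b)
    names = sorted(set(usage_a) | set(usage_b))
    return {n: usage_b.get(n, 0.0) - usage_a.get(n, 0.0) for n in names}


if __name__ == "__main__":
    # Hypothetical abundances for two isoforms of one gene.
    wild_type = {"isoform_1": 80.0, "isoform_2": 20.0}
    mutant = {"isoform_1": 60.0, "isoform_2": 40.0}
    print(isoform_usage(wild_type))
    print(usage_change(wild_type, mutant))
```

Because usage is a proportion, a shift can reflect a change in one isoform, the other, or both, so it is usually interpreted alongside gene-level expression.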
Figure 6. Visualizing splicing events in IGV.
(a) Screenshot showing exon coverage and (b) Sashimi plots with black arrows pointing to junction gains and red arrows pointing to junction losses.
