Genome Biol. 15 (12), 550

Moderated Estimation of Fold Change and Dispersion for RNA-seq Data With DESeq2

Michael I Love et al. Genome Biol.

Abstract

In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.

Figures

Figure 1
Shrinkage estimation of dispersion. Plot of dispersion estimates over the average expression strength (A) for the Bottomly et al. [16] dataset with six samples across two groups and (B) for five samples from the Pickrell et al. [17] dataset, fitting only an intercept term. First, gene-wise MLEs are obtained using only the respective gene’s data (black dots). Then, a curve (red) is fit to the MLEs to capture the overall trend of dispersion-mean dependence. This fit is used as a prior mean for a second estimation round, which results in the final MAP estimates of dispersion (arrow heads). This can be understood as a shrinkage (along the blue arrows) of the noisy gene-wise estimates toward the consensus represented by the red line. The black points circled in blue are detected as dispersion outliers and not shrunk toward the prior (shrinkage would follow the dotted line). For clarity, only a subset of genes is shown, which is enriched for dispersion outliers. Additional file 1: Figure S1 displays the same data but with dispersions of all genes shown. MAP, maximum a posteriori; MLE, maximum-likelihood estimate.
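The shrinkage described in this caption can be illustrated with a minimal sketch. It is not the DESeq2 implementation: it assumes a normal approximation on the log scale, so that the MAP estimate becomes a precision-weighted average of the gene-wise log dispersion and the fitted trend. The function name, the variance parameters `prior_var` and `sampling_var`, and the outlier cutoff `outlier_sd` are hypothetical and chosen for illustration only.

```python
import math


def shrink_log_dispersion(gene_disp, trend_disp, prior_var, sampling_var,
                          outlier_sd=2.0):
    """Toy MAP dispersion for one gene (normal approximation in log space).

    gene_disp    -- gene-wise MLE of dispersion (black dot in the figure)
    trend_disp   -- fitted trend value at this gene's mean (red curve)
    prior_var    -- assumed variance of true log dispersions around the trend
    sampling_var -- assumed sampling variance of the log MLE
    outlier_sd   -- assumed cutoff, in residual standard deviations
    """
    if gene_disp <= 0 or trend_disp <= 0:
        raise ValueError("dispersions must be positive")
    log_mle = math.log(gene_disp)
    log_trend = math.log(trend_disp)

    # Genes far above the trend are treated as dispersion outliers and
    # keep their gene-wise estimate (the points circled in blue).
    residual_sd = math.sqrt(prior_var + sampling_var)
    if log_mle - log_trend > outlier_sd * residual_sd:
        return gene_disp

    # Weight on the gene's own estimate: large when the prior is wide
    # relative to the sampling noise, small when the MLE is noisy.
    weight = prior_var / (prior_var + sampling_var)
    log_map = weight * log_mle + (1.0 - weight) * log_trend
    return math.exp(log_map)


# A noisy estimate is pulled toward the trend; an extreme one is left alone.
shrunk = shrink_log_dispersion(0.4, 0.1, prior_var=0.25, sampling_var=0.25)
outlier = shrink_log_dispersion(10.0, 0.1, prior_var=0.25, sampling_var=0.25)
```

With equal prior and sampling variances the estimate lands halfway between the gene-wise value and the trend on the log scale, which mirrors the blue arrows in the figure.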
Figure 2
Effect of shrinkage on logarithmic fold change estimates. Plots of the (A) MLE (i.e., no shrinkage) and (B) MAP estimate (i.e., with shrinkage) for the LFCs attributable to mouse strain, over the average expression strength for a ten vs eleven sample comparison of the Bottomly et al. [16] dataset. Small triangles at the top and bottom of the plots indicate points that would fall outside of the plotting window. Two genes with similar mean count and MLE logarithmic fold change are highlighted with green and purple circles. (C) The counts (normalized by size factors $s_j$) for these genes reveal low dispersion for the gene in green and high dispersion for the gene in purple. (D) Density plots of the likelihoods (solid lines, scaled to integrate to 1) and the posteriors (dashed lines) for the green and purple genes and of the prior (solid black line): due to the higher dispersion of the purple gene, its likelihood is wider and less peaked (indicating less information), and the prior has more influence on its posterior than for the green gene. The stronger curvature of the green posterior at its maximum translates to a smaller reported standard error for the MAP LFC estimate (horizontal error bar). adj., adjusted; LFC, logarithmic fold change; MAP, maximum a posteriori; MLE, maximum-likelihood estimate.
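The interplay of likelihood width and prior in panel (D) can be sketched under a simplifying assumption that is not taken from the source: both the likelihood and the zero-centered prior are treated as normal densities, so the posterior is normal as well. The function and parameter names below are hypothetical.

```python
import math


def map_lfc(mle_lfc, se_mle, prior_sd):
    """Toy MAP log fold change under a normal likelihood and normal prior.

    mle_lfc  -- unshrunken estimate (peak of the likelihood)
    se_mle   -- width of the likelihood; larger for high-dispersion genes
    prior_sd -- assumed standard deviation of the zero-centered prior
    Returns (map_estimate, posterior_sd).
    """
    if se_mle <= 0 or prior_sd <= 0:
        raise ValueError("standard deviations must be positive")
    likelihood_precision = 1.0 / se_mle ** 2
    prior_precision = 1.0 / prior_sd ** 2
    posterior_precision = likelihood_precision + prior_precision

    # The prior mean is zero, so only the likelihood contributes to the numerator.
    map_estimate = mle_lfc * likelihood_precision / posterior_precision
    # Higher curvature (precision) at the maximum means a smaller standard error.
    posterior_sd = math.sqrt(1.0 / posterior_precision)
    return map_estimate, posterior_sd


# Same MLE, different information content (analogous to the green and purple genes).
green = map_lfc(2.0, se_mle=0.5, prior_sd=1.0)   # narrow likelihood
purple = map_lfc(2.0, se_mle=1.0, prior_sd=1.0)  # wide likelihood
```

In this toy setting the gene with the wider likelihood is shrunk further toward zero and receives the larger posterior standard deviation, matching the qualitative behavior the caption describes.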
Figure 3
Stability of logarithmic fold changes. DESeq2 is run on equally split halves of the data of Bottomly et al. [16], and the LFCs from the halves are plotted against each other. (A) MLEs, i.e., without LFC shrinkage. (B) MAP estimates, i.e., with shrinkage. Points in the top left and bottom right quadrants indicate genes with a change of sign of LFC. Red points indicate genes with adjusted P value <0.1. The legend displays the root-mean-square error of the estimates in group I compared to those in group II. LFC, logarithmic fold change; MAP, maximum a posteriori; MLE, maximum-likelihood estimate; RMSE, root-mean-square error.
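The root-mean-square error reported in the legend compares, gene by gene, the LFC estimated in one half of the data with the LFC estimated in the other half. A minimal sketch follows; the function name and the example values are made up for illustration.

```python
import math


def rmse(lfc_group_1, lfc_group_2):
    """Root-mean-square difference between two equally long LFC vectors."""
    if len(lfc_group_1) != len(lfc_group_2):
        raise ValueError("both halves must cover the same genes")
    if not lfc_group_1:
        raise ValueError("at least one gene is required")
    squared = [(a - b) ** 2 for a, b in zip(lfc_group_1, lfc_group_2)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical LFCs for three genes estimated in each half of the samples.
half_1 = [0.8, -1.2, 0.1]
half_2 = [0.5, -0.9, -0.2]
error = rmse(half_1, half_2)
```

A lower value indicates that the two halves agree more closely, which is the sense in which panel (B) is compared with panel (A).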
Figure 4
Hypothesis testing involving non-zero thresholds. Shown are plots of the estimated fold change over average expression strength (“minus over average”, or MA-plots) for a ten vs eleven comparison using the Bottomly et al. [16] dataset, with highlighted points indicating low adjusted P values. The alternate hypotheses are that logarithmic (base 2) fold changes are (A) greater than 1 in absolute value or (B) less than 1 in absolute value. adj., adjusted.
Figure 5
Variance stabilization and clustering after rlog transformation. Two transformations were applied to the counts of the Hammer et al. [26] dataset: the logarithm of normalized counts plus a pseudocount, i.e. $f(K_{ij}) = \log_2(K_{ij}/s_j + 1)$, and the rlog. The gene-wise standard deviation of transformed values is variable across the range of the mean of counts using the logarithm (A), while relatively stable using the rlog (B). A hierarchical clustering on Euclidean distances and complete linkage using the rlog (D) transformed data clusters the samples into the groups defined by treatment and time, while using the logarithm-transformed counts (C) produces a more ambiguous result. sd, standard deviation.
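The simpler of the two transformations in this caption, the logarithm of size-factor-normalized counts plus a pseudocount of 1, can be written out directly. The sketch below assumes a genes-by-samples matrix stored as a list of rows and one size factor per sample; the function name and the example data are illustrative. The rlog itself is not reproduced here.

```python
import math


def log2_normalized(counts, size_factors):
    """Apply log2(K_ij / s_j + 1) to a genes x samples count matrix."""
    if any(s <= 0 for s in size_factors):
        raise ValueError("size factors must be positive")
    transformed = []
    for row in counts:
        if len(row) != len(size_factors):
            raise ValueError("each row needs one count per sample")
        transformed.append(
            [math.log2(k / s + 1.0) for k, s in zip(row, size_factors)]
        )
    return transformed


# Two genes, two samples; the second sample was sequenced twice as deeply.
example = log2_normalized([[0, 6], [7, 30]], [1.0, 2.0])
# example == [[0.0, 2.0], [3.0, 4.0]]
```

The pseudocount keeps zero counts finite, but it also compresses differences among low counts, which is one reason the standard deviation in panel (A) varies with the mean.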
Figure 6
Sensitivity and precision of algorithms across combinations of sample size and effect size. DESeq2 and edgeR often had the highest sensitivity of those algorithms that controlled the FDR, i.e., those algorithms which fall on or to the left of the vertical black line. For a plot of sensitivity against false positive rate, rather than FDR, see Additional file 1: Figure S8, and for the dependence of sensitivity on the mean of counts, see Additional file 1: Figure S9. Note that EBSeq filters low-count genes (see main text for details).
Figure 7
Benchmark of false positive calling. Shown are estimates of P(P value<0.01) under the null hypothesis. The FPR is the number of P values less than 0.01 divided by the total number of tests, from randomly selected comparisons of five vs five samples from the Pickrell et al. [17] dataset, with no known condition dividing the samples. Type-I error control requires that the tool does not substantially exceed the nominal value of 0.01 (black line). EBSeq results were not included in this plot as it returns posterior probabilities, which unlike P values are not expected to be uniformly distributed under the null hypothesis. FPR, false positive rate.
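The false positive rate defined in this caption is a simple proportion, sketched below. The function name and the example P values are hypothetical; the 0.01 threshold is the nominal value stated in the caption.

```python
def false_positive_rate(p_values, threshold=0.01):
    """Fraction of tests with P value below the threshold.

    Under the null hypothesis (no true condition dividing the samples),
    a well-calibrated method should return roughly `threshold`.
    """
    if not p_values:
        raise ValueError("at least one test is required")
    n_below = sum(1 for p in p_values if p < threshold)
    return n_below / len(p_values)


# Hypothetical P values from one mock five-vs-five comparison.
fpr = false_positive_rate([0.005, 0.2, 0.5, 0.009])
# fpr == 0.5
```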
Figure 8
Sensitivity estimated from experimental reproducibility. Each algorithm’s sensitivity in the evaluation set (box plots) is evaluated using the calls of each other algorithm in the verification set (panels with grey label).
Figure 9
Precision estimated from experimental reproducibility. Each algorithm’s precision in the evaluation set (box plots) is evaluated using the calls of each other algorithm in the verification set (panels with grey label).
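The comparisons behind Figures 8 and 9 can be sketched as set operations. The definitions used here are an assumption, since the captions do not spell them out: calls in the verification set are treated as the reference, sensitivity is the fraction of verification calls recovered in the evaluation set, and precision is the fraction of evaluation calls confirmed in the verification set. Function and gene names are hypothetical.

```python
def sensitivity_and_precision(evaluation_calls, verification_calls):
    """Compare genes called in the evaluation set against the verification set.

    Both arguments are iterables of gene identifiers called significant.
    Returns (sensitivity, precision); a value is NaN when its denominator is 0.
    """
    evaluation = set(evaluation_calls)
    verification = set(verification_calls)
    overlap = len(evaluation & verification)

    # Assumed definitions, with the verification calls acting as "truth".
    sensitivity = overlap / len(verification) if verification else float("nan")
    precision = overlap / len(evaluation) if evaluation else float("nan")
    return sensitivity, precision


# Hypothetical calls: three genes in the evaluation set, four in verification.
sens, prec = sensitivity_and_precision(
    ["g1", "g2", "g3"], ["g2", "g3", "g4", "g5"]
)
# sens == 0.5 (2 of 4 verification calls recovered)
```

Repeating this for every pairing of algorithms, one supplying the evaluation calls and another the verification calls, would give one value per panel and box plot entry.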

References

    1. Lönnstedt I, Speed T. Replicated microarray data. Stat Sinica. 2002;12:31–46.
    2. Robinson MD, Smyth GK. Moderated statistical tests for assessing differences in tag abundance. Bioinformatics. 2007;23:2881–2887. doi: 10.1093/bioinformatics/btm453.
    3. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40:4288–4297. doi: 10.1093/nar/gks042.
    4. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11:106. doi: 10.1186/gb-2010-11-10-r106.
    5. Zhou Y-H, Xia K, Wright FA. A powerful and flexible approach to the analysis of RNA sequence count data. Bioinformatics. 2011;27:2672–2678. doi: 10.1093/bioinformatics/btr449.