43 (7), e47

Limma Powers Differential Expression Analyses for RNA-sequencing and Microarray Studies

Matthew E Ritchie et al. Nucleic Acids Res.

Abstract

limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Figures

Figure 1.
Schematic of the major components that are central to any limma analysis. For each gene g, we have a vector of gene expression values (y_g) and a design matrix X that relates these values to some coefficients of interest (β_g). The limma package includes statistical methods that (i) facilitate information borrowing using empirical Bayes methods to obtain posterior variance estimators, (ii) incorporate observation weights (w_gj, where j refers to sample) to allow for variations in data quality, (iii) allow variance modelling to accommodate technical or biological heterogeneity that may be present and (iv) provide pre-processing, such as variance stabilization, to reduce noise. These methods all help improve inference at both the gene and gene set level in small experiments.
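The components named in this caption can be written out as a short sketch. Only y_g, X, β_g and w_gj come from the caption. The remaining symbols (σ_g², the residual variance s_g² on d_g degrees of freedom, the prior variance s_0² on d_0 degrees of freedom, and the unscaled variance factor v_gk) are assumptions taken from the usual presentation of the moderated t-statistic and may not match the notation in the original figure.

```latex
% Gene-wise linear model with observation weights (caption symbols: y_g, X, beta_g, w_gj)
\mathrm{E}(y_g) = X \beta_g, \qquad \operatorname{var}(y_{gj}) = \frac{\sigma_g^2}{w_{gj}}

% Posterior variance estimator: the gene-wise variance s_g^2 is shrunk
% towards a prior value s_0^2 estimated from all genes (assumed notation)
\tilde{s}_g^2 = \frac{d_0 s_0^2 + d_g s_g^2}{d_0 + d_g}

% Moderated t-statistic for coefficient k, on d_0 + d_g degrees of freedom
\tilde{t}_{gk} = \frac{\hat{\beta}_{gk}}{\tilde{s}_g \sqrt{v_{gk}}}
```

As an illustrative calculation with made-up numbers, d_0 = 4, s_0² = 0.05, d_g = 2 and s_g² = 0.20 give a posterior variance of (4 × 0.05 + 2 × 0.20) / 6 = 0.10, halfway between the noisy gene-wise estimate and the prior. This is the sense in which information is borrowed across genes when sample sizes are small.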
Figure 2.
The limma workflow. The diagram shows the main steps in a gene expression analysis, along with individual functions that might be used and the corresponding classes used to store data or results. Online documentation pages are available both for each individual function and for each major step.
Figure 3.
Example diagnostic plots produced by limma. (A) Plot of variability versus count size for RNA-seq data, generated by voom with plot=TRUE. This plot shows that technical variability decreases with count size. Total variability asymptotes to biological variability as count size increases. (B) Mean-difference plot produced by the plotMA function for a two-colour microarray. The plot highlights negative (NC), constant (DR) and differentially expressed (D03, D10, U03, U10) spike-in controls. Regular probes are not highlighted. (C) Multidimensional scaling (MDS) plot of a set of 30 microarrays, generated by plotMDS. All arrays are biologically identical and the plot reveals strong batch effects. Distances represent leading log2-fold changes between samples.
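The "leading log2-fold change" distance in panel (C) can be illustrated with a small sketch. It assumes the distance between two samples is the root-mean-square of the largest absolute log2 differences between them, which is how the plotMDS documentation describes it. The function name `leading_logfc_distance` and the `top` argument are hypothetical names for this illustration written in Python, not limma's R interface.

```python
import math


def leading_logfc_distance(sample_a, sample_b, top=500):
    """Root-mean-square of the `top` largest absolute log2 differences.

    sample_a, sample_b: equal-length sequences of log2 expression values,
    one entry per gene. Illustrative only; not limma's implementation.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same genes")
    if not sample_a:
        raise ValueError("samples must not be empty")
    # Squared log2 differences, largest first.
    squared = sorted(((a - b) ** 2 for a, b in zip(sample_a, sample_b)),
                     reverse=True)
    leading = squared[:top]  # keep only the genes that differ most
    return math.sqrt(sum(leading) / len(leading))


# Toy data (made up): two samples measured on four genes.
a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 2.0, 1.0, 8.0]
dist_top2 = leading_logfc_distance(a, b, top=2)
dist_all = leading_logfc_distance(a, b, top=4)
```

Restricting the average to the most different genes means that a pair of samples separated by a strong effect on a subset of genes, such as a batch effect, still ends up far apart even when most genes are unchanged.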
Figure 4.
Example plots displaying results from DE and gene set analyses. (A) Volcano plot showing fold changes and posterior odds of DE for a particular comparison (RUNX1 over-expression versus wild-type in this case), generated by volcanoplot. Probes with P < 0.00001 are highlighted in red. (B) Venn diagram showing overlap in the number of DE genes for three comparisons from the same study as (A), generated by the vennDiagram function. (C) Gene set enrichment plot produced by barcodeplot. The central bar orders differentially expressed genes by significance from up to down upon Pax5 restoration in an RNA-seq experiment (7). The vertical bars mark genes that are induced (red) or repressed (blue) upon the transition from large cycling pre-B cells to small resting pre-B cells during normal B cell development according to the published literature (47). The plot shows a strong positive concordance between Pax5 restoration and the large to small cell transition. The roast function can be used to assign statistical significance to this correlation.

References

    1. Gentleman R.C., Carey V.J., Bates D.M., Bolstad B., Dettling M., Dudoit S., Ellis B., Gautier L., Ge Y., Gentry J., et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80.
    2. Smyth G. Limma: linear models for microarray data. In: Gentleman R., Carey V., Dudoit S., Irizarry R., Huber W., editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. New York: Springer; 2005. pp. 397–420.
    3. Peart M., Smyth G., Van Laar R., Bowtell D., Richon V., Marks P., Holloway A., Johnstone R. Identification and functional significance of genes regulated by structurally different histone deacetylase inhibitors. Proc. Natl. Acad. Sci. U.S.A. 2005;102:3697–3702.
    4. Caiazzo M., Dell'Anno M.T., Dvoretskova E., Lazarevic D., Taverna S., Leo D., Sotnikova T.D., Menegon A., Roncaglia P., Colciago G., et al. Direct generation of functional dopaminergic neurons from mouse and human fibroblasts. Nature. 2011;476:224–227.
    5. Hubert F., Kinkel S., Crewther P., Cannon P., Webster K., Link M., Uibo R., O'Bryan M., Meager A., Forehan S., et al. Aire-deficient c57bl/6 mice mimicking the common human 13-base pair deletion mutation present with only a mild autoimmune phenotype. J. Immunol. 2009;182:3902–3918.