Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates

Steven P Lund; Dan Nettleton; Davis J McCarthy; Gordon K Smyth

doi:10.1515/1544-6115.1826

Stat Appl Genet Mol Biol. 2012 Oct 22;11(5). doi: 10.1515/1544-6115.1826.

Authors

Steven P Lund¹, Dan Nettleton, Davis J McCarthy, Gordon K Smyth

Affiliation

¹ Statistical Engineering Division, National Institute of Standards and Technology.

PMID: 23104842
DOI: 10.1515/1544-6115.1826

Abstract

Next-generation sequencing technology provides a powerful tool for measuring gene expression (mRNA) levels in the form of RNA-sequence (RNA-seq) data. Developing methods to identify differentially expressed (DE) genes from RNA-seq data, which frequently include many low integer counts and can exhibit severe overdispersion relative to Poisson or binomial distributions, is an active area of research. Here we present quasi-likelihood methods with shrunken dispersion estimates, based on an adaptation of Smyth's (2004) approach to estimating gene-specific error variances for microarray data. The proposed methods are computationally simple, analogous to ANOVA, and compare favorably with competing methods in detecting DE genes and estimating false discovery rates across a variety of simulations based on real data.
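
The shrinkage idea named in the abstract can be illustrated with a minimal sketch. It assumes the standard form of Smyth's (2004) moderated variance, in which each gene's raw estimate is averaged with a prior value, weighted by degrees of freedom, and it assumes the usual quasi-likelihood F-statistic (deviance difference per test degree of freedom, divided by the dispersion). The function names, the example numbers, and the prior hyperparameters `prior_df` and `prior_phi` are hypothetical; here the hyperparameters are simply supplied, whereas in practice they would be estimated from all genes. This is not the authors' implementation.

```python
def shrink_dispersion(phi_hat, df_resid, prior_df, prior_phi):
    """Degrees-of-freedom-weighted average of a gene's raw quasi-dispersion
    estimate (phi_hat) and a prior value (prior_phi).

    prior_df and prior_phi are assumed to be given; estimating them from
    the full set of genes is outside the scope of this sketch.
    """
    return (prior_df * prior_phi + df_resid * phi_hat) / (prior_df + df_resid)


def quasi_f(dev_diff, df_test, phi):
    """Quasi-likelihood F-type statistic: the drop in deviance between the
    reduced and full models, per test degree of freedom, scaled by the
    (possibly shrunken) dispersion."""
    return dev_diff / (df_test * phi)


# Hypothetical raw dispersion estimates for three genes, each with
# 4 residual degrees of freedom.
raw = [0.5, 1.0, 4.0]
shrunk = [shrink_dispersion(p, df_resid=4, prior_df=4, prior_phi=1.0) for p in raw]
print(shrunk)  # extreme estimates are pulled toward the prior value 1.0

# Hypothetical deviance difference of 6.0 on 2 test degrees of freedom.
f_stat = quasi_f(6.0, df_test=2, phi=1.5)
print(f_stat)
```

Under these assumptions, a statistic built from a shrunken dispersion would be referred to an F distribution with `df_test` numerator and `prior_df + df_resid` denominator degrees of freedom, which is where the analogy to ANOVA comes from.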

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Base Sequence
Databases, Genetic
Gene Expression Profiling / methods
Gene Expression Profiling / statistics & numerical data*
Likelihood Functions
RNA, Messenger / metabolism
Sequence Analysis, RNA / methods*

Substances

RNA, Messenger