A model for isoform-level differential expression analysis using RNA-seq data without pre-specifying isoform structure

PLoS One. 2022 May 16;17(5):e0266162. doi: 10.1371/journal.pone.0266162. eCollection 2022.

Abstract

Motivation: Next-generation sequencing (NGS) technology has been widely used in biomedical research, particularly in genomics-related studies. One NGS application is high-throughput mRNA sequencing (RNA-seq), which is usually applied to evaluate gene expression levels (i.e., copy numbers of isoforms), to identify differentially expressed genes, and to discover potential alternative splicing events. Popular tools for differential expression (DE) analysis using RNA-seq data include edgeR and DESeq. These methods identify DE genes at the gene level, so they can compare only the total size of isoforms, that is, the sum over all isoforms of each isoform's copy number times its length. Consequently, they may fail to detect DE genes when the total size of isoforms remains similar but isoform-wise expression levels change dramatically. Other tools can perform isoform-level DE analysis, but only when isoform structures are known, so they fail for many non-model species whose isoform information is missing. To overcome these disadvantages, we developed an isoform-free (no need to pre-specify isoform structures) splicing-graph-based negative binomial (SGNB) model for differential expression analysis at the isoform level. Our model detects not only changes in the total size of isoforms but also changes in isoform-wise expression levels, and hence is more powerful.
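
To make the gene-level limitation concrete, the following minimal sketch computes the total size of isoforms (copy number times length, summed over isoforms) for one hypothetical gene under two conditions. The isoform lengths, copy numbers, and the helper name `total_size` are illustrative assumptions, not values or code from the paper.

```python
# Hypothetical gene with two isoforms; all numbers are made up for illustration.
lengths = [1000, 2000]  # isoform lengths in bases (assumed)


def total_size(copies, lengths):
    """Sum of copy number times length over all isoforms."""
    return sum(c * l for c, l in zip(copies, lengths))


condition_a = [40, 10]  # copies of isoform 1 and isoform 2 in condition A
condition_b = [20, 20]  # copies of isoform 1 and isoform 2 in condition B

size_a = total_size(condition_a, lengths)  # 40*1000 + 10*2000
size_b = total_size(condition_b, lengths)  # 20*1000 + 20*2000

# The gene-level summary is identical even though each isoform's
# copy number differs between the two conditions.
print(size_a, size_b, size_a == size_b)
```

In this toy case the total size is the same in both conditions, so a comparison based only on that total has nothing to detect, while the isoform-wise copy numbers differ by a factor of two in each isoform.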

Results: We performed extensive simulations to compare our method with edgeR and DESeq. Under various scenarios, our method consistently achieved higher detection power while controlling the type I error at the pre-specified level. We also applied our method to a real data set to illustrate its applicability in practice.

MeSH terms

  • Alternative Splicing*
  • Gene Expression Profiling* / methods
  • Protein Isoforms / genetics
  • Protein Isoforms / metabolism
  • RNA-Seq
  • Sequence Analysis, RNA / methods

Substances

  • Protein Isoforms

Grant support

The authors received no specific funding for this work.