Dynamic expression of 3' UTRs revealed by Poisson hidden Markov modeling of RNA-Seq: implications in gene expression profiling

Gene. 2013 Sep 25;527(2):616-23. doi: 10.1016/j.gene.2013.06.052. Epub 2013 Jul 9.

Abstract

RNA sequencing (RNA-Seq) allows for the identification of novel exon-exon junctions and the quantification of gene expression levels. We show that RNA-Seq data can also be used to detect alternative polyadenylation (APA) in 3' untranslated regions (3' UTRs), which are known to play a critical role in the regulation of mRNA stability, cellular localization and translation efficiency. Because APA is dynamic, it is desirable to examine it on a sample-by-sample basis. We used a Poisson hidden Markov model (PHMM) of RNA-Seq data to identify potential APA leading to shortened 3' UTRs in human liver and brain cortex tissues. Over three hundred transcripts with shortened 3' UTRs were detected with sensitivity >75% and specificity >60%. Tissue-specific 3' UTR shortening was observed for 32 genes with a q-value ≤ 0.1. When compared with the alternative isoforms detected by Cufflinks or MISO, our PHMM method agreed on over 100 transcripts with shortened 3' UTRs. Given the increasing use of RNA-Seq for gene expression profiling, using PHMM to investigate sample-specific 3' UTR shortening could be an added benefit of this emerging technology.
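To make the idea of a Poisson hidden Markov model over 3' UTR read coverage concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-position (or per-bin) read counts along a single 3' UTR, two hidden states (full coverage and reduced coverage), and a left-to-right structure in which coverage can drop once and does not recover. The Poisson rates `lam_high` and `lam_low`, the switch probability `p_switch`, and the function names are all hypothetical; a real analysis would presumably estimate such parameters from the data rather than fix them by hand.

```python
import math


def poisson_logpmf(k, lam):
    # Log-probability of observing count k under a Poisson with mean lam.
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_two_state(counts, lam_high, lam_low, p_switch=0.05):
    """Most likely state path for a left-to-right two-state Poisson HMM.

    State 0 = full coverage, state 1 = reduced coverage.
    Assumption: the chain starts in state 0 and never leaves state 1.
    """
    neg_inf = float("-inf")
    log_trans = [
        [math.log(1.0 - p_switch), math.log(p_switch)],  # from state 0
        [neg_inf, 0.0],                                   # from state 1
    ]
    lams = [lam_high, lam_low]

    # Initialisation: forced start in the full-coverage state.
    score = [poisson_logpmf(counts[0], lams[0]), neg_inf]
    back = []  # back[t-1][s] = best predecessor of state s at position t

    for k in counts[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            cands = [score[r] + log_trans[r][s] for r in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new_score.append(cands[best] + poisson_logpmf(k, lams[s]))
        score = new_score
        back.append(ptr)

    # Backtrace from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def shortening_index(path):
    # First position decoded as reduced coverage, or None if there is none.
    return path.index(1) if 1 in path else None


# Toy coverage profile (made-up numbers): coverage drops after position 4.
toy_counts = [20, 22, 19, 21, 18, 5, 4, 6, 3, 5]
toy_path = viterbi_two_state(toy_counts, lam_high=20.0, lam_low=5.0)
print(toy_path, shortening_index(toy_path))
```

In this sketch, the position at which the decoded path first enters the reduced-coverage state serves as a candidate for a proximal polyadenylation site; a path that never leaves state 0 indicates no evidence of shortening under the assumed rates.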

Keywords: 3′ UTRs (3′ untranslated regions); APA (alternative polyadenylation); BIC (Bayesian information criterion); CDFs (chip design files); EM (Expectation and Maximization); EST (expressed sequence tag); GEO (Gene Expression Omnibus); IVT (in vitro transcription); miRNA (microRNA); PAS-Seq (polyadenylation site sequencing); PHMM (Poisson hidden Markov model); RACE (rapid amplification of cDNA ends); RNA-Seq (RNA-sequencing); SRA (Sequence Read Archive); SVM (support vector machine); bp (base-pairs); Microarray; Untranslated region.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • 3' Untranslated Regions*
  • Gene Expression Profiling*
  • Markov Chains*
  • Poisson Distribution*
  • Sequence Analysis, RNA*

Substances

  • 3' Untranslated Regions