RNA sequencing (RNA-Seq) and mass spectrometry (MS)-based shotgun proteomics are powerful high-throughput technologies for identifying and quantifying RNA transcripts and proteins, respectively. With the increasing affordability of these technologies, many projects have started to apply both to the same samples to achieve a more comprehensive understanding of biological systems. A major analytical challenge for such integrative projects is how to effectively leverage the complementary nature of RNA-Seq and shotgun proteomics data. RNA-Seq provides comprehensive information on mRNA abundance, alternative splicing, nucleotide variation, and structural alterations. Sample-specific protein databases derived from RNA-Seq data can better approximate the real protein pools in cell and tissue samples and thus improve protein identification. Meanwhile, proteomics data provide essential confirmation of the validity and functional relevance of novel findings from RNA-Seq data. At the quantitative level, mRNA and protein levels are only modestly correlated, suggesting strong involvement of posttranscriptional regulation in controlling gene expression. Here, we review recent studies at the interface of RNA-Seq and proteomics data. We discuss goals, accomplishments, and challenges in RNA-Seq-based proteogenomics. We also examine the current status and future potential of parallel transcriptome and proteome quantification in revealing posttranscriptional regulatory mechanisms.
Keywords: Data integration; Posttranscriptional regulation; Proteogenomics; Proteomics; RNA-Seq.
© 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.