Improvement of peptide identification with considering the abundance of mRNA and peptide

Chunwei Ma; Shaohang Xu; Geng Liu; Xin Liu; Xun Xu; Bo Wen; Siqi Liu

doi:10.1186/s12859-017-1491-5

BMC Bioinformatics. 2017 Feb 16;18(1):109. doi: 10.1186/s12859-017-1491-5.

Authors

Chunwei Ma¹, Shaohang Xu¹, Geng Liu¹, Xin Liu¹, Xun Xu¹, Bo Wen², Siqi Liu³

Affiliations

¹ BGI-Shenzhen, Shenzhen, 518083, China.
² BGI-Shenzhen, Shenzhen, 518083, China. wenbo@genomics.cn.
³ BGI-Shenzhen, Shenzhen, 518083, China. siqiliu@genomics.cn.

Abstract

Background: Tandem mass spectrometry (MS/MS) followed by database search is a principal approach for identifying peptides and proteins in proteomic studies. Considerable effort has been devoted to improving the accuracy and sensitivity of peptide/protein identification, for example by developing advanced algorithms and expanding protein databases.

Results: We describe a new strategy for enhancing the sensitivity of protein/peptide identification by combining mRNA and peptide abundance in Percolator. The workflow draws on the abundance of transcripts and potential novel transcripts derived from RNA-Seq, together with the abundance of peptides, all from the same species. We demonstrate the utility of this strategy on two MS/MS datasets. The results indicate that integrating the peptide abundance, the transcript abundance and potential novel transcripts from RNA-Seq data improves peptide identification by about 5% to 8% at 1% FDR at the peptide level. In addition, 181 and 154 novel peptides were identified in the two datasets, respectively.
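
As an illustration only, the idea of supplying abundance information as extra features and then filtering at 1% peptide-level FDR can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation and not Percolator's input format or API: the record fields, the feature names `log_mrna` and `log_pep_count`, the use of spectral counts as the peptide-abundance proxy, and the simple target-decoy q-value estimate are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def add_abundance_features(psms, transcript_abundance):
    """Return copies of PSM records with two extra (hypothetical) features.

    psms: list of dicts with at least the keys 'peptide' and 'protein'.
    transcript_abundance: dict mapping a protein/transcript id to an
        RNA-Seq abundance value (assumed here to be FPKM-like, >= 0).
    """
    # Assumed peptide-abundance proxy: number of spectra matched to a peptide.
    spectral_counts = Counter(p["peptide"] for p in psms)
    augmented = []
    for p in psms:
        q = dict(p)  # copy so the input records are left untouched
        mrna = transcript_abundance.get(p["protein"], 0.0)
        q["log_mrna"] = math.log2(mrna + 1.0)
        q["log_pep_count"] = math.log2(spectral_counts[p["peptide"]] + 1.0)
        augmented.append(q)
    return augmented


def peptide_qvalues(peptides):
    """Estimate q-values with a simple target-decoy count.

    peptides: list of (score, is_decoy) pairs, one per unique peptide,
        where a higher score means a better match.
    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(peptides, key=lambda x: x[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at this score threshold or any looser one.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()
    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


if __name__ == "__main__":
    psms = [
        {"peptide": "AAK", "protein": "T1"},
        {"peptide": "AAK", "protein": "T1"},
        {"peptide": "CCR", "protein": "T2"},
    ]
    for row in add_abundance_features(psms, {"T1": 3.0}):
        print(row)
    scored = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
    accepted = [r for r in peptide_qvalues(scored) if not r[1] and r[2] <= 0.01]
    print(accepted)
```

In a real workflow the augmented features would be handed to the rescoring tool alongside its usual search-engine features, and the FDR threshold would be applied to the rescored peptides.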

Conclusions: We have demonstrated that, compared with traditional search methods, this strategy can improve peptide/protein identification and enable the discovery of novel peptides.

Keywords: Bioinformatics; Machine learning; Mass spectrometry; Proteogenomics; RNA-Seq; Shotgun proteomics.

MeSH terms

Algorithms
Databases, Protein
Peptides* / analysis
Peptides* / chemistry
Peptides* / genetics
Proteomics / methods*
RNA, Messenger* / analysis
RNA, Messenger* / genetics
Tandem Mass Spectrometry / methods*

Substances

Peptides
RNA, Messenger