Improvement of peptide identification with considering the abundance of mRNA and peptide

BMC Bioinformatics. 2017 Feb 16;18(1):109. doi: 10.1186/s12859-017-1491-5.

Abstract

Background: Tandem mass spectrometry (MS/MS) followed by database search is a principal approach for identifying peptides and proteins in proteomic studies. Considerable effort has been devoted to improving the accuracy and sensitivity of peptide/protein identification, for example by developing advanced algorithms and expanding protein databases.

Results: Here we describe a new strategy for enhancing the sensitivity of protein/peptide identification by incorporating mRNA and peptide abundance into Percolator. The strategy establishes a new peptide identification workflow based on the abundance of transcripts and potential novel transcripts derived from RNA-Seq, together with the abundance of peptides from the same species. We demonstrate its utility on two MS/MS datasets: integrating peptide abundance, transcript abundance and potential novel transcripts from RNA-Seq data improves peptide identification by about 5% to 8% at a 1% peptide-level FDR. In addition, 181 and 154 novel peptides were identified in the two datasets, respectively.
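
As a rough illustration of how abundance information might be supplied to a semi-supervised rescoring tool such as Percolator, the sketch below appends two extra features to each peptide-spectrum match (PSM). It is not the authors' implementation: the function name, the dictionary keys (`peptide`, `transcripts`), the use of FPKM as the transcript abundance measure, the use of a spectral count as the peptide abundance measure, and the log2(x + 1) transform are all assumptions made for this example.

```python
import math
from collections import Counter


def add_abundance_features(psms, transcript_fpkm):
    """Return copies of the PSM dicts with two hypothetical abundance features.

    psms: list of dicts with keys "peptide" (str) and "transcripts" (list of
          transcript IDs the peptide maps to). These key names are assumed.
    transcript_fpkm: dict mapping transcript ID to an RNA-Seq abundance value.
    """
    # Peptide abundance proxy: how many PSMs share the same peptide sequence.
    spectral_counts = Counter(p["peptide"] for p in psms)

    augmented = []
    for p in psms:
        # Transcript abundance: the highest value among the mapped transcripts,
        # or 0.0 when the peptide maps to no quantified transcript.
        fpkm = max(
            (transcript_fpkm.get(t, 0.0) for t in p["transcripts"]),
            default=0.0,
        )
        row = dict(p)  # shallow copy, so the input list is left unchanged
        row["log_fpkm"] = math.log2(fpkm + 1.0)
        row["log_spectral_count"] = math.log2(spectral_counts[p["peptide"]] + 1.0)
        augmented.append(row)
    return augmented


if __name__ == "__main__":
    example_psms = [
        {"peptide": "AAK", "transcripts": ["t1", "t2"]},
        {"peptide": "AAK", "transcripts": ["t1"]},
        {"peptide": "CCR", "transcripts": []},
    ]
    example_fpkm = {"t1": 3.0, "t2": 7.0}
    for row in add_abundance_features(example_psms, example_fpkm):
        print(row)
```

In a real workflow the augmented rows would be written out as additional feature columns in the rescoring tool's input, and the rescored peptides would then be filtered at the chosen FDR threshold.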

Conclusions: Compared with traditional search methods, this strategy can improve peptide/protein identification and enable the discovery of novel peptides.

Keywords: Bioinformatics; Machine learning; Mass spectrometry; Proteogenomics; RNA-Seq; Shotgun proteomics.

MeSH terms

  • Algorithms
  • Databases, Protein
  • Peptides* / analysis
  • Peptides* / chemistry
  • Peptides* / genetics
  • Proteomics / methods*
  • RNA, Messenger* / analysis
  • RNA, Messenger* / genetics
  • Tandem Mass Spectrometry / methods*

Substances

  • Peptides
  • RNA, Messenger