iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates
- PMID: 21876204
- PMCID: PMC3237071
- DOI: 10.1074/mcp.M111.007690
Abstract
The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools, the Trans-Proteomic Pipeline. Applied in tandem with PeptideProphet, it provides a more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomic Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool, Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence-identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomic Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms.
The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets.
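The relationship between posterior probabilities and false discovery rate estimates mentioned in the abstract can be illustrated with a minimal sketch. This is not iProphet's implementation and does not reproduce its statistical models. It only shows the general idea that, given posterior probabilities of correctness, the expected false discovery rate among accepted identifications is the mean of (1 - p). The function names, the input layout, and the "keep the best probability per peptide sequence" roll-up are assumptions made for illustration.

```python
# Illustrative sketch only; names and the roll-up rule are hypothetical,
# not taken from iProphet or the Trans-Proteomic Pipeline.

def fdr_at_threshold(probabilities, threshold):
    """Estimate the FDR among identifications with probability >= threshold.

    Each accepted identification with posterior probability p is expected
    to be wrong with probability (1 - p), so the estimated FDR is the mean
    of (1 - p) over the accepted set.
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0  # nothing accepted, so no false discoveries to count
    return sum(1.0 - p for p in accepted) / len(accepted)


def best_probability_per_peptide(psms):
    """Collapse spectrum-level identifications to distinct peptide sequences.

    `psms` is an iterable of (peptide_sequence, probability) pairs.
    Assumption: a naive roll-up that keeps the highest probability seen
    for each sequence. iProphet instead adjusts probabilities using
    additional models, which this sketch does not attempt.
    """
    best = {}
    for peptide, prob in psms:
        if peptide not in best or prob > best[peptide]:
            best[peptide] = prob
    return best


if __name__ == "__main__":
    psms = [("PEPTIDEK", 0.60), ("PEPTIDEK", 0.90), ("ELVISK", 0.40)]
    peptide_probs = best_probability_per_peptide(psms)
    print(peptide_probs)
    print(fdr_at_threshold(peptide_probs.values(), 0.5))
```

Computing the estimate on distinct peptide sequences rather than on individual spectra matters because many spectra can map to the same peptide, so spectrum-level and sequence-level error rates generally differ.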
