Motivation: Enzymatic digestion of proteins before mass spectrometry analysis is a key process in metaproteomic workflows. Canonical metaproteomic data processing pipelines typically involve matching spectra produced by the mass spectrometer to a theoretical spectra database, followed by matching the identified peptides back to parent-proteins. However, the nature of enzymatic digestion produces peptides that can be found in multiple proteins due to conservation or chance, presenting difficulties with protein and functional assignment.
Results: To combat this challenge, we developed pepFunk, a peptide-centric metaproteomic workflow focused on the analysis of human gut microbiome samples. Our workflow includes a curated peptide database annotated with Kyoto Encyclopedia of Genes and Genomes (KEGG) terms and a gene set variation analysis-inspired pathway enrichment adapted for peptide-level data. Analysis using our peptide-centric workflow is fast and highly correlated to a protein-centric analysis, and can identify more enriched KEGG pathways than analysis using protein-level data. Our workflow is open source and available as a web application or source code to be run locally.
Availability and implementation: pepFunk is available online as a web application at https://shiny.imetalab.ca/pepFunk/ with open-source code available from https://github.com/northomics/pepFunk.
Contact: dfigeys@uottawa.ca.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.