psm_utils: A High-Level Python API for Parsing and Handling Peptide-Spectrum Matches and Proteomics Search Results

J Proteome Res. 2023 Feb 3;22(2):557-560. doi: 10.1021/acs.jproteome.2c00609. Epub 2022 Dec 12.

Abstract

A plethora of proteomics search engine output file formats are in circulation. This lack of standardized output files greatly complicates generic downstream processing of peptide-spectrum matches (PSMs) and PSM files. While standards exist to solve this problem, these are far from universally supported by search engines. Moreover, software libraries are available to read a selection of PSM file formats, but a package to parse PSM files into a unified data structure has been missing. Here, we present psm_utils, a Python package to read and write various PSM file formats and to handle peptidoforms, PSMs, and PSM lists through a unified and user-friendly Python API, command line interface, and web interface. psm_utils was developed with pragmatism and maintainability in mind, adhering to community standards and relying on existing packages where possible. The Python API and command line interface greatly facilitate handling various PSM file formats. Moreover, a user-friendly web application was built using psm_utils that allows anyone to interconvert PSM files and retrieve basic PSM statistics. psm_utils is freely available under the permissive Apache 2.0 license at https://github.com/compomics/psm_utils.
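To illustrate what parsing heterogeneous PSM files into a unified data structure means in practice, the following is a minimal standard-library sketch. It does not use or reproduce the psm_utils API: the `SimplePSM` class, the two reader functions, and both file layouts are hypothetical and invented for illustration only.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class SimplePSM:
    """Hypothetical unified PSM record (not the psm_utils PSM class)."""
    peptidoform: str   # peptide sequence plus charge, e.g. "ACDEK/2"
    spectrum_id: str
    score: float


def read_format_a(text):
    """Parse a made-up tab-separated layout: peptide, scan, score."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [SimplePSM(r["peptide"], r["scan"], float(r["score"])) for r in rows]


def read_format_b(text):
    """Parse a made-up comma-separated layout: SpectrumID, Sequence, Charge, XCorr."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        SimplePSM(f'{r["Sequence"]}/{r["Charge"]}', r["SpectrumID"], float(r["XCorr"]))
        for r in rows
    ]


# Two invented inputs with different column names, order, and delimiters.
FORMAT_A = "peptide\tscan\tscore\nACDEK/2\tscan=1\t0.9\n"
FORMAT_B = "SpectrumID,Sequence,Charge,XCorr\nscan=2,PEPTIDEK,3,2.5\n"

# After parsing, downstream code sees one record type regardless of the source.
psms = read_format_a(FORMAT_A) + read_format_b(FORMAT_B)
for psm in psms:
    print(psm.spectrum_id, psm.peptidoform, psm.score)
```

Once every reader returns the same record type, downstream steps such as filtering, counting, or writing to another format need to be implemented only once, which is the motivation stated in the abstract.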

Keywords: bioinformatics; data analysis; peptide identification; peptide-spectrum matches; proteomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Peptides
  • Proteomics* / methods
  • Search Engine
  • Software*

Substances

  • Peptides