Communicating Mass Spectrometry Quality Information in mzQC with Python, R, and Java

J Am Soc Mass Spectrom. 2024 Aug 7;35(8):1875-1882. doi: 10.1021/jasms.4c00174. Epub 2024 Jun 25.

Abstract

Mass spectrometry is a powerful technique for analyzing molecules in complex biological samples. However, inter- and intralaboratory variability and bias can affect the data due to various factors, including sample handling and preparation, instrument calibration and performance, and data acquisition and processing. To address this issue, the Quality Control (QC) working group of the Human Proteome Organization's Proteomics Standards Initiative has established the standard mzQC file format for reporting and exchanging information relating to data quality. mzQC is based on the JavaScript Object Notation (JSON) format and provides a lightweight yet versatile file format that can be easily implemented in software. Here, we present open-source software libraries to process mzQC data in three programming languages: Python, using pymzqc; R, using rmzqc; and Java, using jmzqc. The libraries follow a common data model and provide shared functionalities, including the (de)serialization and validation of mzQC files. We demonstrate use of the software libraries in a workflow for extracting, analyzing, and visualizing QC metrics from different sources. Additionally, we show how these libraries can be integrated with each other, with existing software tools, and in automated workflows for the QC of mass spectrometry data. All software libraries are available as open source under the MS-Quality-Hub organization on GitHub (https://github.com/MS-Quality-Hub).
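Because mzQC is plain JSON, the basic idea of deserializing a file, checking its structure, and extracting metrics can be sketched with the Python standard library alone. The sketch below does not use pymzqc, rmzqc, or jmzqc and is not their API. The embedded document, the field names (`runQualities`, `qualityMetrics`, `controlledVocabularies`, and so on), the accessions, the values, and the helper names `missing_fields` and `metrics_by_run` are all illustrative assumptions based on the general mzQC layout. The key check is a crude stand-in for real schema validation.

```python
import json

# Hypothetical, minimal mzQC-like document. Field names, accessions, and
# values are illustrative assumptions, not taken from a real data set.
MZQC_TEXT = """
{
  "mzQC": {
    "version": "1.0.0",
    "creationDate": "2024-06-25T12:00:00",
    "runQualities": [
      {
        "metadata": {"label": "example_run"},
        "qualityMetrics": [
          {"accession": "MS:4000059", "name": "number of MS1 spectra", "value": 1200},
          {"accession": "MS:4000060", "name": "number of MS2 spectra", "value": 8500}
        ]
      }
    ],
    "controlledVocabularies": [{"name": "PSI-MS"}]
  }
}
"""

# Keys assumed to be mandatory for this sketch; a real validator would
# check the document against the official JSON schema instead.
REQUIRED_TOP_LEVEL = ("version", "creationDate", "controlledVocabularies")


def missing_fields(doc):
    """Return the assumed-required keys that are absent under 'mzQC'."""
    body = doc.get("mzQC", {})
    return [key for key in REQUIRED_TOP_LEVEL if key not in body]


def metrics_by_run(doc):
    """Map each run label to a {metric name: value} dictionary."""
    table = {}
    for run in doc["mzQC"].get("runQualities", []):
        label = run["metadata"]["label"]
        table[label] = {m["name"]: m["value"] for m in run["qualityMetrics"]}
    return table


doc = json.loads(MZQC_TEXT)                   # deserialize
problems = missing_fields(doc)                # crude structural check
metrics = metrics_by_run(doc)                 # extract QC metrics per run
round_tripped = json.loads(json.dumps(doc))   # serialize and read back

print(problems)
print(metrics)
```

A dedicated library would typically add what this sketch leaves out: typed objects for the data model, validation against the schema, and checks that metric accessions exist in the referenced controlled vocabulary.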

MeSH terms

  • Humans
  • Mass Spectrometry* / methods
  • Mass Spectrometry* / standards
  • Programming Languages*
  • Proteomics* / methods
  • Proteomics* / standards
  • Quality Control*
  • Software*
  • Workflow