Proteomic repository data submission, dissemination, and reuse: key messages

Expert Rev Proteomics. 2022 Jul-Dec;19(7-12):297-310. doi: 10.1080/14789450.2022.2160324. Epub 2022 Dec 26.

Abstract

Introduction: The creation of ProteomeXchange data workflows in 2012 transformed the field of proteomics by standardizing data submission and dissemination and enabling the widespread reanalysis of public MS proteomics data worldwide. ProteomeXchange has triggered a growing trend toward public dissemination of proteomics data, facilitating the assessment, reuse, comparative analysis, and extraction of new findings from public datasets. As of 2022, the consortium comprises PRIDE, PeptideAtlas, MassIVE, jPOST, iProX, and Panorama Public.

Areas covered: Here, we review and discuss the current ecosystem of resources, guidelines, and file formats for proteomics data dissemination and reanalysis. Special attention is given to new quantitative and post-translational modification-oriented resources. We also discuss the challenges and future directions for data deposition, including the lack of metadata and of cloud-based, high-performance software solutions for fast and reproducible reanalysis of the available data.

Expert opinion: The success of ProteomeXchange and the amount of proteomics data available in the public domain have triggered the creation and/or growth of other protein knowledgebase resources. Data reuse is a leading, active, and evolving field that supports the creation of new formats, tools, and workflows to rediscover and reshape public proteomics data.

Keywords: Proteomics databases; computational biology; data integration; data reuse and reanalysis; mass spectrometry; open access data; protein expression; proteomics; public proteomics data; standard file formats.

Publication types

  • Review

MeSH terms

  • Databases, Protein
  • Ecosystem*
  • Humans
  • Proteins / metabolism
  • Proteomics*
  • Software

Substances

  • Proteins