The mzIdentML Data Standard Version 1.2, Supporting Advances in Proteome Informatics

Mol Cell Proteomics. 2017 Jul;16(7):1275-1285. doi: 10.1074/mcp.M117.068429. Epub 2017 May 17.

Abstract

The first stable version of the Proteomics Standards Initiative mzIdentML open data standard (version 1.1) was published in 2012-capturing the outputs of peptide and protein identification software. In the intervening years, the standard has become well-supported in both commercial and open software, as well as a submission and download format for public repositories. Here we report a new release of mzIdentML (version 1.2) that is required to keep pace with emerging practice in proteome informatics. New features have been added to support: (1) scores associated with localization of modifications on peptides; (2) statistics performed at the level of peptides; (3) identification of cross-linked peptides; and (4) support for proteogenomics approaches. In addition, there is now improved support for the encoding of de novo sequencing of peptides, spectral library searches, and protein inference. As a key point, the underlying XML schema has only undergone very minor modifications to simplify as much as possible the transition from version 1.1 to version 1.2 for implementers, but there have been several notable updates to the format specification, implementation guidelines, controlled vocabularies and validation software. mzIdentML 1.2 can be described as backwards compatible, in that reading software designed for mzIdentML 1.1 should function in most cases without adaptation. We anticipate that these developments will provide a continued stable base for software teams working to implement the standard. All the related documentation is accessible at http://www.psidev.info/mzidentml.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / standards*
  • Databases, Protein
  • Proteomics / standards*
  • Software