A uniform proteomics MS/MS analysis platform utilizing open XML file formats

Mol Syst Biol. 2005;1:2005.0017. doi: 10.1038/msb4100024. Epub 2005 Aug 2.


The analysis of tandem mass spectrometry (MS/MS) data to identify and quantify proteins is hampered by the heterogeneity of file formats at the raw spectral data, peptide identification, and protein identification levels. Different mass spectrometers output their raw spectral data in a variety of proprietary formats, and the alternative methods that assign peptides to MS/MS spectra and infer protein identifications from those peptide assignments each write their results in a different format. Here we describe an MS/MS analysis platform, the Trans-Proteomic Pipeline, which uses open XML file formats to store data at the raw spectral data, peptide, and protein levels. This platform enables uniform analysis and exchange of MS/MS data generated by a variety of instruments and assigned to peptides by a variety of database search programs. We demonstrate this by applying the pipeline to data sets generated by ThermoFinnigan LCQ, ABI 4700 MALDI-TOF/TOF, and Waters Q-TOF instruments, and searched in turn using SEQUEST, Mascot, and COMET.
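The benefit of a shared XML layer can be illustrated with a minimal sketch. The element and attribute names below (`peptide_results`, `run`, `spectrum`, `hit`, `score`) are hypothetical simplifications chosen for illustration and are not the pipeline's actual schemas; the peptide strings, protein names, and scores are made-up placeholder values. The point is only that once results from different instruments and search programs share one layout, a single reader can serve all of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified peptide-level document. The tag and attribute
# names are assumptions for illustration, NOT the pipeline's real schema,
# and every value below is a made-up placeholder.
PEPTIDE_XML = """\
<peptide_results>
  <run instrument="LCQ" search_engine="SEQUEST">
    <spectrum id="scan_0001" charge="2">
      <hit peptide="PEPTIDEK" protein="PROT_A" score="0.98"/>
    </spectrum>
  </run>
  <run instrument="Q-TOF" search_engine="Mascot">
    <spectrum id="scan_0417" charge="3">
      <hit peptide="SAMPLER" protein="PROT_B" score="0.91"/>
    </spectrum>
  </run>
</peptide_results>
"""


def read_assignments(xml_text):
    """Return one dict per peptide assignment, regardless of its origin."""
    root = ET.fromstring(xml_text)
    rows = []
    for run in root.iter("run"):
        for spectrum in run.iter("spectrum"):
            for hit in spectrum.iter("hit"):
                rows.append({
                    "instrument": run.get("instrument"),
                    "search_engine": run.get("search_engine"),
                    "spectrum": spectrum.get("id"),
                    "charge": int(spectrum.get("charge")),
                    "peptide": hit.get("peptide"),
                    "protein": hit.get("protein"),
                    "score": float(hit.get("score")),
                })
    return rows


def peptides_by_protein(rows, min_score=0.9):
    """Group peptides by protein, keeping hits at or above min_score."""
    grouped = {}
    for row in rows:
        if row["score"] >= min_score:
            grouped.setdefault(row["protein"], []).append(row["peptide"])
    return grouped


if __name__ == "__main__":
    assignments = read_assignments(PEPTIDE_XML)
    for row in assignments:
        print(row["search_engine"], row["spectrum"], row["peptide"])
    print(peptides_by_protein(assignments))
```

Here the same `read_assignments` function handles the SEQUEST and Mascot runs without any engine-specific branches, which is the kind of uniformity a common format is meant to provide; the `min_score` threshold is likewise an illustrative assumption.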

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Archaeal Proteins / chemistry
  • Chromatography, Liquid
  • Databases, Factual
  • Electronic Data Processing
  • Halobacterium / chemistry
  • Information Storage and Retrieval / methods*
  • Mass Spectrometry / instrumentation
  • Mass Spectrometry / methods*
  • Peptides / chemistry
  • Probability
  • Programming Languages*
  • Proteins / chemistry
  • Proteomics / methods*
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / instrumentation


Substances

  • Archaeal Proteins
  • Peptides
  • Proteins