Preprocessing of tandem mass spectrometric data to support automatic protein identification

Proteomics. 2003 Aug;3(8):1597-610. doi: 10.1002/pmic.200300486.


Liquid chromatography tandem mass spectrometry is a major tool for identifying proteins. The fragment spectra of peptides can be interpreted automatically in conjunction with a sequence database search. With the development of powerful automatic search engines, research now focuses on optimizing the results returned from database searches. We present a series of preprocessing steps for fragment spectra that increase the accuracy and specificity of automatic database searches. After processing, the correct amino acid sequences from the database can be matched more reliably to the fragment spectra. This increases the sensitivity and reliability of protein identifications, especially with very large genomic databanks, and can be important for the systematic characterization of post-translational modifications.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Calibration
  • Chromatography, High Pressure Liquid
  • Information Storage and Retrieval
  • Mass Spectrometry / methods*
  • Proteins / chemistry*


Substances

  • Proteins