Faster SEQUEST searching for peptide identification from tandem mass spectra

J Proteome Res. 2011 Sep 2;10(9):3871-9. doi: 10.1021/pr101196n. Epub 2011 Jul 29.

Abstract

Computational analysis of mass spectra remains the bottleneck in many proteomics experiments. SEQUEST was one of the earliest software packages to identify peptides from mass spectra by searching a database of known peptides. Though still popular, SEQUEST performs slowly. Crux and TurboSEQUEST have successfully sped up SEQUEST by adding a precomputed index to the search, but the demand for ever-faster peptide identification software continues to grow. Tide, introduced here, is a software program that implements the SEQUEST algorithm for peptide identification and that achieves a dramatic speedup over Crux and SEQUEST. The optimization strategies detailed here employ a combination of algorithmic and software engineering techniques to achieve speeds up to 170 times faster than a recent version of SEQUEST that uses indexing. For example, on a single Xeon CPU, Tide searches 10,000 spectra against a tryptic database of 27,499 Caenorhabditis elegans proteins at a rate of 1550 spectra per second, which compares favorably with a rate of 8.8 spectra per second for a recent version of SEQUEST with index running on the same hardware.
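The precomputed index mentioned above can be illustrated with a minimal sketch. This is not Tide's, Crux's, or TurboSEQUEST's actual implementation; it only shows the general idea of digesting proteins once, sorting the peptides by mass, and then retrieving candidates for each spectrum by binary search rather than rescanning the protein database. The function names (tryptic_peptides, build_index, candidates), the simplified cleavage rule (cut after K or R unless followed by P, no missed cleavages), and the tolerance value are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the terminal H and OH


def tryptic_peptides(protein):
    """Split a protein after K or R unless the next residue is P.

    Simplified rule for illustration: no missed cleavages.
    """
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        if at_end or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def build_index(proteins):
    """Digest every protein once and return (mass, peptide) pairs sorted by mass."""
    unique = {pep for prot in proteins for pep in tryptic_peptides(prot)}
    return sorted((peptide_mass(pep), pep) for pep in unique)


def candidates(index, masses, precursor_mass, tolerance):
    """Return peptides whose mass lies within +/- tolerance of the precursor.

    `masses` is the list of masses from `index`, extracted once so that
    each lookup is a pair of binary searches.
    """
    lo = bisect_left(masses, precursor_mass - tolerance)
    hi = bisect_right(masses, precursor_mass + tolerance)
    return [pep for _, pep in index[lo:hi]]


# Hypothetical toy database; a real search would load a FASTA file.
index = build_index(["AKRPGKLE", "GAVKSTR"])
masses = [m for m, _ in index]
hits = candidates(index, masses, precursor_mass=217.14, tolerance=0.01)
```

In this sketch the index is built once and reused for every spectrum, so the per-spectrum cost of finding candidates is logarithmic in the number of peptides. Scoring each candidate against the observed spectrum, which is where SEQUEST-style searches spend most of their time, is deliberately left out.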

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Databases, Protein
  • Peptide Fragments / chemistry*
  • Peptide Mapping / methods*
  • Proteomics / methods*
  • Reproducibility of Results
  • Software*
  • Tandem Mass Spectrometry / methods*

Substances

  • Peptide Fragments