Computational approaches to peptide identification via tandem MS

Methods Mol Biol. 2010;604:23-42. doi: 10.1007/978-1-60761-444-9_3.

Abstract

Peptide identification lies at the heart of modern proteomic methodology: the presence of a particular protein or proteins in a sample is inferred from the peptides identified. The challenge is to find the most likely amino acid sequence corresponding to each tandem mass spectrum collected, and to produce a score and an associated statistical measure of confidence that the putative identification is correct. Database searching assumes that the peptide (and parent protein) sequence in question is known and present in the database to be searched, in contrast to de novo methods, which seek to identify the peptide ab initio. This chapter gives an overview of the methods that popular software tools use to search protein sequence databases, providing the non-expert reader with sufficient background to appreciate the choices available to them. It covers the approaches used to compare experimental and theoretical spectra and some of the methods used to validate assignments and raise confidence in them.
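As a rough illustration of comparing experimental and theoretical spectra, the sketch below generates singly charged b- and y-ion m/z values for each candidate peptide and counts how many fall within a tolerance of an observed peak. It is a minimal shared-peak-count scorer under simplifying assumptions (unmodified peptides, charge 1+ fragments, no intensity weighting, no statistical validation), not the scoring function of any particular search tool. The function names and the tolerance defaults are hypothetical; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the intact peptide
PROTON = 1.00728   # mass of the charge-carrying proton


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragment_mz(seq):
    """Singly charged b- and y-ion m/z values for every cleavage site."""
    ions = []
    for i in range(1, len(seq)):
        b = sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON
        ions.extend([b, y])
    return ions


def shared_peak_count(observed_mz, seq, tol=0.5):
    """Number of theoretical fragments matched by an observed peak."""
    return sum(
        any(abs(theo - obs) <= tol for obs in observed_mz)
        for theo in fragment_mz(seq)
    )


def best_match(observed_mz, precursor_mass, candidates,
               precursor_tol=1.0, tol=0.5):
    """Score candidates whose mass fits the precursor; return (score, seq)."""
    scored = [
        (shared_peak_count(observed_mz, pep, tol), pep)
        for pep in candidates
        if abs(peptide_mass(pep) - precursor_mass) <= precursor_tol
    ]
    return max(scored) if scored else None


# Usage: an idealised "spectrum" built from the true peptide's own fragments.
spectrum = fragment_mz("PEPTIDE")
result = best_match(spectrum, peptide_mass("PEPTIDE"), ["PEPTIDE", "EDITPEP"])
```

Here the two candidates share the same composition, so both pass the precursor-mass filter and only the fragment comparison separates them. Real search tools add intensity-aware scoring and a statistical measure of confidence on top of this kind of matching.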

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Computational Biology / methods*
  • Databases, Protein*
  • Humans
  • Molecular Sequence Data
  • Peptides / analysis*
  • Proteomics / methods
  • Software*
  • Tandem Mass Spectrometry / methods*

Substances

  • Peptides