An algorithm for reducing the time necessary to match a large set of peptide tandem mass spectra with a list of protein sequences is described. This algorithm breaks the process into multiple steps. A rapid survey step identifies all protein sequences that are reasonable candidates for a match with a set of tandem mass spectra. These candidates are then used as models, which are refined by detailed analysis of the set of tandem mass spectra for evidence of incomplete enzymatic hydrolysis, non-specific hydrolysis and chemical modifications of amino acid residues resulting from either post-translational modifications or sample handling. Compared with current one-step methods for matching proteins to mass spectra, this multiple-step method can decrease the time required for the calculation by several orders of magnitude.
Copyright 2003 John Wiley & Sons, Ltd.
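The survey-then-refine idea described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the published implementation: each tandem mass spectrum is reduced to a single observed peptide mass, the enzyme is assumed to be trypsin (cleavage after K or R, except before P), methionine oxidation is the only modification considered, and non-specific hydrolysis is left out. The function names, the tolerance, and the example sequences are all hypothetical.

```python
# Toy two-pass search: a cheap survey over all proteins, then a detailed
# refinement restricted to the surviving candidates. All names are illustrative.

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
OXIDATION = 15.99491  # assumed variable modification on methionine


def peptide_mass(peptide, n_oxidized=0):
    """Neutral monoisotopic mass of a peptide with n_oxidized oxidations."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER + n_oxidized * OXIDATION


def tryptic_peptides(sequence, missed_cleavages=0):
    """Peptides from cleavage after K/R (not before P), joining up to
    `missed_cleavages` adjacent fragments to model incomplete hydrolysis."""
    pieces, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (aa in "KR" and sequence[i + 1] != "P"):
            pieces.append(sequence[start:i + 1])
            start = i + 1
    peptides = set()
    for i in range(len(pieces)):
        last_j = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last_j + 1):
            peptides.add("".join(pieces[i:j + 1]))
    return peptides


def matches(mass, observed, tol=0.01):
    """True if `mass` lies within `tol` Da of any observed mass."""
    return any(abs(mass - m) <= tol for m in observed)


def survey(proteins, observed, tol=0.01):
    """Step 1: fully cleaved, unmodified peptides only, over every protein."""
    return [
        name for name, seq in proteins.items()
        if any(matches(peptide_mass(p), observed, tol) for p in tryptic_peptides(seq))
    ]


def refine(proteins, candidates, observed, max_missed=2, tol=0.01):
    """Step 2: missed cleavages and oxidation, over candidate proteins only."""
    hits = {}
    for name in candidates:
        found = set()
        for pep in tryptic_peptides(proteins[name], max_missed):
            for n_ox in range(pep.count("M") + 1):
                if matches(peptide_mass(pep, n_ox), observed, tol):
                    found.add((pep, n_ox))
        hits[name] = found
    return hits


# Hypothetical data: two made-up sequences and three "observed" masses.
proteins = {"P1": "MAGKSTLR", "P2": "WWYYFFK"}
observed = [
    peptide_mass("STLR"),        # fully cleaved, unmodified
    peptide_mass("MAGK", 1),     # oxidized methionine
    peptide_mass("MAGKSTLR"),    # one missed cleavage
]
candidates = survey(proteins, observed)
hits = refine(proteins, candidates, observed)
```

In this sketch the saving comes from where the expensive enumeration is spent: the combinations of missed cleavages and modifications are generated only for proteins that passed the survey, rather than for every sequence in the list. A real search would score fragment ions from each spectrum instead of comparing single masses.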