On the accuracy and limits of peptide fragmentation spectrum prediction

Anal Chem. 2011 Feb 1;83(3):790-6. doi: 10.1021/ac102272r. Epub 2010 Dec 22.


We estimated the reproducibility of tandem mass spectra for the widely used collision-induced dissociation (CID) of peptide ions. Using the Pearson correlation coefficient as a measure of spectral similarity, we found that the within-experiment reproducibility of fragment ion intensities is very high (about 0.85). However, across different experiments and instrument types/setups, the correlation decreases by more than 15% (to about 0.70). We further investigated the accuracy of current predictors of peptide fragmentation spectra and found that they are more accurate than the ad-hoc models generally used by search engines (e.g., SEQUEST) and, surprisingly, approach the empirical upper limit set by the average across-experiment spectral reproducibility (especially for charge +1 and charge +2 precursor ions). These results provide evidence that, in terms of modeling accuracy, predicted peptide fragmentation spectra are a viable alternative to spectral libraries for peptide identification, with higher peptide coverage and lower storage requirements. Furthermore, using five data sets of proteome digests produced by two different proteases, we found that PeptideART (a data-driven machine learning approach) is generally more accurate than MassAnalyzer (an approach based on a kinetic model for peptide fragmentation) in predicting fragmentation spectra, but that both models are significantly more accurate than the ad-hoc models.
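The similarity measure named in the abstract can be illustrated with a minimal sketch. It assumes that two spectra of the same peptide have already been reduced to intensity vectors over the same ordered set of fragment ions; that alignment step, the function name `pearson`, and the intensity values below are illustrative assumptions and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length intensity vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical normalized intensities for the same five fragment ions
# observed in two replicate spectra (made-up values for illustration).
replicate_a = [0.10, 0.55, 1.00, 0.30, 0.05]
replicate_b = [0.12, 0.50, 1.00, 0.35, 0.02]

similarity = pearson(replicate_a, replicate_b)
print(f"spectral similarity: {similarity:.3f}")
```

Under this sketch, a value near 1 means the two spectra rank and scale their fragment intensities almost identically; the same function could compare a predicted spectrum against an observed one, provided both are expressed over the same fragment ions.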

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Deinococcus / chemistry
  • Mass Spectrometry / methods*
  • Peptide Fragments / analysis*
  • Proteome / analysis
  • Reproducibility of Results
  • Saccharomyces cerevisiae / chemistry
  • Shewanella / chemistry


Substances

  • Peptide Fragments
  • Proteome