A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes

Anal Chem. 2003 Feb 15;75(4):768-74. doi: 10.1021/ac0258709.


This paper investigates the use of survival functions and expectation values to evaluate the results of protein identification experiments. These functions are standard statistical measures that can be used to reduce various protein identification scoring schemes to a common, easily interpretable representation. The relative merits of scoring systems were explored using this approach, as well as the effects of altering primary identification parameters. We would advocate the widespread use of these simple statistical measures to simplify and standardize the reporting of the confidence of protein identification results, allowing the users of different identification algorithms to compare their results in a straightforward and statistically significant manner. A method is described for measuring these distributions using information that is being discarded by most protein identification search engines, resulting in accurate survival functions that are specific to any combination of scoring algorithms, sequence databases, and mass spectra.
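
As a rough illustration of the two measures named above, the sketch below builds an empirical survival function from a list of scores and converts one score into an expectation value. It is a minimal sketch and not the paper's procedure, which this abstract does not spell out. It assumes that the scores of all candidate sequences for a spectrum (the information a search engine would normally discard) are available as a plain list and that they are dominated by random matches. The function names `survival_function` and `expectation_value` are hypothetical.

```python
from bisect import bisect_left


def survival_function(scores):
    """Return s(x): the fraction of the observed scores that are >= x.

    Assumption: `scores` holds the scores of all candidate matches for
    one spectrum, most of which are random (incorrect) matches.
    """
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one score is required")

    def s(x):
        # bisect_left returns how many scores are strictly below x,
        # so n minus that count is the number of scores >= x.
        return (n - bisect_left(ordered, x)) / n

    return s


def expectation_value(score, scores):
    """Expected number of matches scoring >= `score` by chance alone.

    Uses the standard relation e(x) = n * s(x), where n is the number
    of candidates scored and s is the survival function.
    """
    s = survival_function(scores)
    return len(scores) * s(score)
```

An empirical estimate like this cannot resolve probabilities smaller than one over the number of scores, so a practical implementation would probably fit a model to the high-scoring tail of the distribution and extrapolate from it. That step is an assumption here and is omitted from the sketch.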

MeSH terms

  • Mass Spectrometry / methods*
  • Proteins / analysis*
  • Software
  • Statistics as Topic / methods*
  • Trypsin / analysis


Substances

  • Proteins
  • Trypsin