Combining High-Resolution and Exact Calibration To Boost Statistical Power: A Well-Calibrated Score Function for High-Resolution MS2 Data

J Proteome Res. 2018 Nov 2;17(11):3644-3656. doi: 10.1021/acs.jproteome.8b00206. Epub 2018 Oct 18.

Abstract

To achieve accurate assignment of peptide sequences to observed fragmentation spectra, a shotgun proteomics database search tool must make good use of the very high-resolution information produced by state-of-the-art mass spectrometers. However, making use of this information while also ensuring that the search engine's scores are well calibrated, that is, that the score assigned to one spectrum can be meaningfully compared to the score assigned to a different spectrum, has proven challenging. Here we describe a database search score function, the "residue evidence" (res-ev) score, that achieves both of these goals simultaneously. We also demonstrate how to combine calibrated res-ev scores with calibrated XCorr scores to produce a "combined p-value" score function. We provide a benchmark consisting of four mass spectrometry data sets, which we use to compare the combined p-value to the score functions used by several existing search engines. Our results suggest that the combined p-value achieves state-of-the-art performance, generally outperforming MS Amanda and Morpheus and performing comparably to MS-GF+. The res-ev and combined p-value score functions are freely available as part of the Tide search engine in the Crux mass spectrometry toolkit (http://crux.ms).
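
The abstract does not say how the two calibrated scores are merged, so the following is only an illustrative sketch of one standard option: Fisher's method for two p-values, which assumes the two p-values are independent and uniformly distributed under the null hypothesis. The function name `fisher_combine_two` and the input values are hypothetical and are not taken from the paper or from Crux.

```python
import math


def fisher_combine_two(p1: float, p2: float) -> float:
    """Combine two p-values with Fisher's method (illustrative assumption).

    Fisher's statistic is -2 * ln(p1 * p2), which follows a chi-square
    distribution with 4 degrees of freedom under the null hypothesis.
    For exactly two p-values, that tail probability has the closed form
    x * (1 - ln x), where x = p1 * p2.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    x = p1 * p2
    if x == 0.0:
        # ln(0) is undefined; the limit of x * (1 - ln x) as x -> 0 is 0.
        return 0.0
    return x * (1.0 - math.log(x))


# Hypothetical per-spectrum p-values, e.g. one from each score function.
combined = fisher_combine_two(0.05, 0.05)
```

The combined value is never smaller than the raw product of the two inputs, because the correction factor `1 - ln x` is at least 1. This keeps the result interpretable as a p-value when the independence assumption holds.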

Keywords: database search; high-resolution MS2; statistical calibration.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adrenal Glands / chemistry
  • Algorithms*
  • Amino Acid Sequence
  • Aquatic Organisms / chemistry
  • Benchmarking
  • Calibration
  • Complex Mixtures / chemistry
  • Databases, Protein
  • Datasets as Topic
  • Escherichia coli Proteins / chemistry*
  • Escherichia coli Proteins / classification
  • Escherichia coli Proteins / isolation & purification
  • Humans
  • Peptide Mapping / methods
  • Peptide Mapping / statistics & numerical data*
  • Peptides / chemistry*
  • Peptides / classification
  • Peptides / isolation & purification
  • Plasmodium falciparum / chemistry
  • Proteolysis
  • Proteomics / methods
  • Protozoan Proteins / chemistry*
  • Protozoan Proteins / classification
  • Protozoan Proteins / isolation & purification
  • Software
  • Tandem Mass Spectrometry / methods
  • Tandem Mass Spectrometry / statistics & numerical data*

Substances

  • Complex Mixtures
  • Escherichia coli Proteins
  • Peptides
  • Protozoan Proteins