High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis

Nat Methods. 2019 Jun;16(6):519-525. doi: 10.1038/s41592-019-0427-6. Epub 2019 May 27.

Abstract

Peptide fragmentation spectra are routinely predicted in the interpretation of mass-spectrometry-based proteomics data. However, the generation of fragment ions has not been understood well enough for scientists to estimate fragment ion intensities accurately. Here, we demonstrate that machine learning can predict peptide fragmentation patterns in mass spectrometers with accuracy within the uncertainty of measurement. Moreover, analysis of our models reveals that peptide fragmentation depends on long-range interactions within a peptide sequence. We illustrate the utility of our models by applying them to the analysis of both data-dependent and data-independent acquisition datasets. In the former case, we observe a q-value-dependent increase in the total number of peptide identifications. In the latter case, we confirm that the use of predicted tandem mass spectrometry spectra is nearly equivalent to the use of spectra from experimental libraries.
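
As an illustration of what "accuracy within the uncertainty of measurement" could mean in practice, the sketch below compares a predicted fragment-intensity vector with a measured one and sets that score against the agreement between two replicate measurements of the same peptide. This is a minimal sketch under stated assumptions, not the authors' evaluation procedure: the abstract does not name a similarity metric, so cosine similarity is an assumption here, the function name cosine_similarity is hypothetical, and all intensity values are invented for illustration rather than taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two intensity vectors.

    Both vectors must list intensities for the same fragment ions in the
    same order (for example b1, b2, ..., y1, y2, ...).
    """
    if len(a) != len(b):
        raise ValueError("intensity vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # An empty spectrum carries no information, so report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical relative intensities for six fragment ions of one peptide.
replicate_1 = [1.00, 0.42, 0.10, 0.65, 0.00, 0.21]  # first measurement
replicate_2 = [0.95, 0.47, 0.12, 0.60, 0.03, 0.18]  # repeat measurement
predicted = [0.98, 0.40, 0.15, 0.66, 0.01, 0.25]    # model output (invented)

# Agreement between two measurements approximates the measurement uncertainty.
replicate_agreement = cosine_similarity(replicate_1, replicate_2)
# Agreement between the prediction and one measurement.
prediction_agreement = cosine_similarity(predicted, replicate_1)

print(f"replicate vs replicate: {replicate_agreement:.4f}")
print(f"predicted vs replicate: {prediction_agreement:.4f}")
```

If the prediction-versus-measurement score is about as high as the replicate-versus-replicate score, the prediction differs from a measurement no more than two measurements differ from each other, which is one way to read the abstract's claim.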

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Biomarkers / blood*
  • Data Analysis*
  • Databases, Protein
  • HeLa Cells
  • Humans
  • Peptide Fragments / analysis*
  • Peptide Fragments / metabolism
  • Peptide Library*
  • Proteome / analysis*
  • Proteome / metabolism
  • Software*
  • Tandem Mass Spectrometry / methods*

Substances

  • Biomarkers
  • Peptide Fragments
  • Peptide Library
  • Proteome