Deep learning boosts sensitivity of mass spectrometry-based immunopeptidomics

Nat Commun. 2021 Jun 7;12(1):3346. doi: 10.1038/s41467-021-23713-9.


Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immuno-oncology. Still, the identification of non-tryptic peptides presents substantial computational challenges. To address these, we synthesized and analyzed >300,000 peptides by multi-modal LC-MS/MS within the ProteomeTools project, representing HLA class I and II ligands and products of the proteases AspN and LysN. The resulting data enabled training of a single model using the deep learning framework Prosit, allowing the accurate prediction of fragment ion spectra for tryptic and non-tryptic peptides. Applying Prosit demonstrates that the identification of HLA peptides can be improved by up to 7-fold, that 87% of the proposed proteasomally spliced HLA peptides may be incorrect, and that dozens of additional immunogenic neo-epitopes can be identified from patient tumors in published data. Together, the provided peptides, spectra and computational tools substantially expand the analytical depth of immunopeptidomics workflows.
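To make "fragment ion spectra" concrete, the sketch below computes the theoretical m/z positions of the b- and y-ion series for an unmodified peptide from standard monoisotopic residue masses. It is only an illustration and not Prosit's method: Prosit predicts fragment intensities with a trained neural network, whereas this code gives only the peak positions that such intensities would be attached to. The function name `fragment_mz` and the example peptides are assumptions made here for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def fragment_mz(peptide, charge=1):
    """Return (b_ions, y_ions) m/z lists for an unmodified peptide.

    b_ions[i] is b_(i+1) (N-terminal fragment of i+1 residues);
    y_ions[i] is y_(i+1) (C-terminal fragment of i+1 residues).
    Hypothetical helper for illustration; modifications are ignored.
    """
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)

    b_ions = []
    prefix = 0.0
    for i in range(n - 1):              # b_1 .. b_(n-1)
        prefix += masses[i]
        b_ions.append((prefix + charge * PROTON) / charge)

    y_ions = []
    suffix = 0.0
    for i in range(n - 1, 0, -1):       # y_1 .. y_(n-1)
        suffix += masses[i]
        y_ions.append((suffix + WATER + charge * PROTON) / charge)

    return b_ions, y_ions


if __name__ == "__main__":
    # Example 8-mer without a C-terminal K/R preference (non-tryptic-like).
    b, y = fragment_mz("SIINFEKL")
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.4f}   y{i} = {yi:.4f}")
```

The calculation is the same for tryptic and non-tryptic sequences; what differs between peptide classes, and what a model trained on synthetic HLA, AspN and LysN peptides has to learn, is how intense each of these fragments is.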

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line
  • Deep Learning*
  • Epitopes
  • Extracellular Matrix Proteins / metabolism
  • HLA Antigens / immunology
  • Histocompatibility Antigens Class I / metabolism
  • Histocompatibility Antigens Class II / metabolism
  • Humans
  • Ligands
  • Mass Spectrometry
  • Molecular Medicine
  • Peptides / immunology*
  • Peptides / metabolism
  • Proteomics
  • Tandem Mass Spectrometry / methods*


Substances

  • ASPN protein, human
  • Epitopes
  • Extracellular Matrix Proteins
  • HLA Antigens
  • Histocompatibility Antigens Class I
  • Histocompatibility Antigens Class II
  • Ligands
  • Peptides