Cancer neoantigen prioritization through sensitive and reliable proteogenomics analysis

Nat Commun. 2020 Apr 9;11(1):1759. doi: 10.1038/s41467-020-15456-w.


Genomics-based neoantigen discovery can be enhanced by proteomic evidence, but there remains a lack of consensus on the performance of different quality control methods for variant peptide identification in proteogenomics. We propose to use the difference between accurately predicted and observed retention times for each peptide as a metric to evaluate different quality control methods. To this end, we develop AutoRT, a deep learning algorithm with high accuracy in retention time prediction. Analysis of three cancer data sets with a total of 287 tumor samples using different quality control strategies results in substantially different numbers of identified variant peptides and putative neoantigens. Our systematic evaluation, using the proposed retention time metric, provides insights and practical guidance on the selection of quality control strategies. We implement the recommended strategy in a computational workflow named NeoFlow to support proteogenomics-based neoantigen prioritization, enabling more sensitive discovery of putative neoantigens.
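
As a rough illustration of the retention time metric described in the abstract, the Python sketch below computes the absolute difference between predicted and observed retention time for each identified peptide and summarizes it per quality control strategy. The field names (`predicted_rt`, `observed_rt`), the strategy names, and the toy values are hypothetical. The sketch does not call AutoRT or NeoFlow, and the median is only one plausible summary; the paper's actual evaluation may differ.

```python
from statistics import median


def rt_deviations(peptides):
    """Return |predicted - observed| retention time (minutes) per peptide."""
    return [abs(p["predicted_rt"] - p["observed_rt"]) for p in peptides]


def summarize_by_strategy(identifications):
    """Map each QC strategy to the median absolute RT deviation of its peptides.

    Strategies with no identified peptides are skipped.
    """
    return {
        strategy: median(rt_deviations(peptides))
        for strategy, peptides in identifications.items()
        if peptides
    }


# Hypothetical toy data (minutes); not taken from the paper.
example = {
    "strategy_a": [
        {"predicted_rt": 30.0, "observed_rt": 31.0},
        {"predicted_rt": 45.0, "observed_rt": 44.5},
        {"predicted_rt": 60.0, "observed_rt": 62.0},
    ],
    "strategy_b": [
        {"predicted_rt": 30.0, "observed_rt": 40.0},
        {"predicted_rt": 50.0, "observed_rt": 49.0},
        {"predicted_rt": 20.0, "observed_rt": 35.0},
    ],
}

summary = summarize_by_strategy(example)
for strategy, deviation in sorted(summary.items()):
    print(f"{strategy}: median |delta RT| = {deviation:.2f} min")
```

Under this reading, a strategy whose identifications show smaller deviations is presumably admitting fewer false variant peptide matches, since a correctly identified peptide should elute close to its predicted time.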

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Antigens, Neoplasm / genetics
  • Antigens, Neoplasm / metabolism*
  • Computational Biology / methods
  • Deep Learning
  • Genomics / methods*
  • Humans
  • Neoplasms / genetics
  • Neoplasms / immunology
  • Neoplasms / metabolism*
  • Peptides / genetics
  • Peptides / immunology
  • Peptides / metabolism
  • Proteogenomics / methods*
  • Proteome / genetics
  • Proteome / immunology
  • Proteome / metabolism
  • Proteomics / methods*
  • Tandem Mass Spectrometry / methods

Substances

  • Antigens, Neoplasm
  • Peptides
  • Proteome