Introduction: The availability of big data sets ('OMICS') has greatly impacted fundamental and translational science. High-throughput analysis of HLA class I and II associated peptidomes by mass spectrometry (MS) has generated large datasets, with the last decade witnessing tremendous growth in the breadth and number of studies. Areas covered: Here, we first analyzed naturally processed peptide (NP) data captured within the Immune Epitope Database (IEDB) to survey and characterize the current state of NP data. We next asked to what extent the NP data overlap with existing T cell epitope and MHC binding data. Expert commentary: The current collection of NP data represents a large and diverse set of class I and II peptides, mostly derived from self-antigens. These data overlap only marginally with existing immunogenicity and binding data, so it is difficult to ascertain the correspondence between the different assay methodologies. This highlights a need for unbiased benchmarking studies, in model antigen systems, of how well MHC binding and NP data predict immunogenicity. Going forward, efforts to build an integrated process for capturing all NP data, curating the associated metadata, and accessing NP data from an immunological viewpoint will be important for developing novel methods to identify optimal target antigens and to predict class I and II epitopes.
Keywords: Epitopes; HLA; MHC; T cell; ligand; ligandome; mass spectrometry; naturally-processed; peptidome.