Epitopes that arise from somatic mutations, also called neoepitopes, are now known to play a key role in cancer immunology and immunotherapy. Recent advances in high-throughput sequencing have made it possible to identify all mutations, and thereby all potential neoepitope candidates, in an individual cancer. However, most of these candidates are not recognized by T cells of cancer patients when tested in vivo or in vitro; that is, they are not immunogenic. In patients with a high mutational load in particular, hundreds of potential neoepitopes are typically detected, highlighting the need to narrow down this candidate list. In our study, we assembled a dataset of known, naturally processed, immunogenic neoepitopes to dissect the properties that make these neoepitopes immunogenic. The tools and thresholds used to prioritize neoepitopes have so far been based largely on experience with epitope identification in other settings, such as infectious disease and allergy. Here, we performed a detailed analysis of our curated dataset to establish the appropriate tools and thresholds for the cancer setting. To this end, we evaluated different predictors for parameters that contribute to a neoepitope's immunogenicity, and we suggest that combining binding predictions with length-rescaling yields the best performance in discriminating immunogenic neoepitopes from a background set of mutated peptides. We furthermore show that almost all neoepitopes had strong predicted binding affinities, as expected, but that, more surprisingly, the corresponding non-mutated peptides had affinities nearly as high. Our results provide a rational basis for the parameters used in common neoepitope filtering approaches.
Abbreviations: SNV: single nucleotide variant; nsSNV: nonsynonymous single nucleotide variant; ROC: receiver operating characteristic; AUC: area under the ROC curve; HLA: human leukocyte antigen; MHC: major histocompatibility complex; PD-1: programmed cell death protein 1; PD-L1: programmed death-ligand 1; CTLA-4: cytotoxic T-lymphocyte-associated protein 4.
Keywords: HLA binding; bioinformatics; cancer; immunotherapy; neoantigen; neoepitope.
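
As an illustration of the kind of discrimination analysis summarized in the abstract, the following minimal sketch computes an AUC from predicted binding affinities for an immunogenic set and a background set. It is not the study's pipeline: the function name `roc_auc`, the variable names, and all IC50 values are hypothetical and chosen only for illustration, and length-rescaling is not modeled here.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted IC50 values in nM (lower = stronger predicted binding).
immunogenic_ic50 = [12.0, 35.0, 80.0, 410.0]
background_ic50 = [25.0, 600.0, 1500.0, 9000.0, 22000.0]

# Negate the IC50 values so that a higher score means stronger predicted binding.
auc = roc_auc([-x for x in immunogenic_ic50], [-x for x in background_ic50])
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 would mean the predictor separates the two sets no better than chance, while 1.0 would mean every immunogenic peptide is ranked above every background peptide.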