MHC-I peptides are an important class of immunopeptides presented by the major histocompatibility complex (MHC) on the cell surface for immune recognition. The majority of reported MHC-I peptides are derived from protein-coding sequences, whereas noncanonical peptides translated from small open reading frames (sORFs) remain largely uncharacterized because accurate and sensitive detection methods are lacking. Herein we report an efficient approach that combines complementary bioinformatic strategies to improve the identification of noncanonical MHC-I peptides. In the database search strategy, the mapping of noncanonical immunopeptides was optimized by combining three complementary pipelines that construct predicted sORF databases from Ribo-seq data. In the de novo peptide sequencing strategy, MS data search results were filtered against the sORF databases to identify additional noncanonical immunopeptides. In total, 308 noncanonical immunopeptides were identified from two tumor cell lines, and selected ones were rigorously validated. Our approach offers a practical way to identify noncanonical MHC-I peptides from Ribo-seq and MS data. Moreover, the novel noncanonical immunopeptides identified with this method could provide insights into fundamental immunology as well as cancer immunotherapy.
Keywords: MHC-I peptides; Ribo-seq; database search; de novo sequencing; small open reading frames.
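
The filtering step of the de novo strategy can be illustrated with a minimal sketch. The abstract does not state the matching criteria, so everything here is an assumption made for illustration: the function names (`normalize`, `filter_de_novo_peptides`), the exact-substring match, the 8 to 11 residue length window, and the treatment of isoleucine and leucine as equivalent (the two residues have identical mass, so de novo sequencing generally cannot tell them apart). The sORF sequences in the usage example are made up.

```python
from typing import Dict, Iterable, List, Tuple


def normalize(seq: str) -> str:
    """Uppercase the sequence and collapse I to L (isobaric residues)."""
    return seq.upper().replace("I", "L")


def filter_de_novo_peptides(
    peptides: Iterable[str],
    sorf_db: Dict[str, str],
    min_len: int = 8,   # assumed lower bound for MHC-I peptide length
    max_len: int = 11,  # assumed upper bound for MHC-I peptide length
) -> List[Tuple[str, str]]:
    """Return (peptide, sORF id) pairs for peptides contained in an sORF product.

    `sorf_db` maps a hypothetical sORF identifier to its translated
    amino acid sequence, e.g. as predicted from Ribo-seq data.
    """
    # Normalize the database once so each lookup compares like with like.
    normalized_db = {sid: normalize(seq) for sid, seq in sorf_db.items()}

    hits: List[Tuple[str, str]] = []
    for pep in peptides:
        # Skip candidates outside the assumed length window.
        if not (min_len <= len(pep) <= max_len):
            continue
        key = normalize(pep)
        for sid, seq in normalized_db.items():
            if key in seq:  # exact substring match after I/L collapsing
                hits.append((pep, sid))
    return hits


# Usage with made-up sequences (not data from the study).
example_db = {
    "sORF_1": "MKTLLIAVSPQRWGYHE",
    "sORF_2": "MAAPLKITEVW",
}
example_peptides = ["TLLLAVSPQ", "AAPLKITEV", "GGGGGGGGG", "MKT"]
example_hits = filter_de_novo_peptides(example_peptides, example_db)
```

A linear scan over the database is adequate for a small illustration; a real sORF database would likely call for an indexed lookup, and a real workflow would typically also apply score thresholds from the de novo sequencing tool before this step.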