Nucleic Acids Res. 2018 Jan 4;46(D1):D1237-D1247.
doi: 10.1093/nar/gkx664.

The SysteMHC Atlas Project

Free PMC article

Wenguang Shao et al. Nucleic Acids Res.


Mass spectrometry (MS)-based immunopeptidomics investigates the repertoire of peptides presented at the cell surface by major histocompatibility complex (MHC) molecules. The broad clinical relevance of MHC-associated peptides, e.g. in precision medicine, provides a strong rationale for the large-scale generation of immunopeptidomic datasets, and recent developments in MS-based peptide analysis technologies now support the generation of the required data. Importantly, the availability of diverse immunopeptidomic datasets has resulted in an increasing need to standardize, store and exchange this type of data to enable better collaborations among researchers, to advance the field more efficiently and to establish quality measures required for the meaningful comparison of datasets. Here we present the SysteMHC Atlas, a public database that aims at collecting, organizing, sharing, visualizing and exploring immunopeptidomic data generated by MS. The Atlas includes raw mass spectrometer output files collected from several laboratories around the globe, a catalog of context-specific datasets of MHC class I and class II peptides, standardized MHC allele-specific peptide spectral libraries consisting of consensus spectra calculated from repeat measurements of the same peptide sequence, and links to other proteomics and immunology databases. The SysteMHC Atlas project was created and will be further expanded using a uniform and open computational pipeline that controls the quality of peptide identifications and peptide annotations. Thus, the SysteMHC Atlas disseminates quality-controlled immunopeptidomic information to the public domain and serves as a community resource toward the generation of a high-quality comprehensive map of the human immunopeptidome and the support of consistent measurement of immunopeptidomic sample cohorts.
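The idea of a consensus spectrum calculated from repeat measurements of the same peptide can be illustrated with a minimal sketch. This is not the algorithm used by the Atlas pipeline or by SpectraST. It is a simplified stand-in that bins peaks by m/z, keeps bins observed in at least half of the replicates, and averages them. The function name, the bin width, the replicate fraction and the peak values are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def consensus_spectrum(
    replicates: List[List[Peak]],
    bin_width: float = 0.5,      # assumed m/z bin width, not from the paper
    min_fraction: float = 0.5,   # assumed fraction of replicates a peak must appear in
) -> List[Peak]:
    """Average peaks that recur across replicate spectra of one peptide."""
    bins: Dict[int, List[Peak]] = defaultdict(list)
    for spectrum in replicates:
        seen_bins = set()
        for mz, intensity in spectrum:
            key = int(mz // bin_width)
            if key in seen_bins:
                continue  # count at most one peak per bin per replicate
            seen_bins.add(key)
            bins[key].append((mz, intensity))

    needed = min_fraction * len(replicates)
    consensus: List[Peak] = []
    for key in sorted(bins):
        peaks = bins[key]
        if len(peaks) >= needed:
            # Means are taken over the replicates in which the peak was seen.
            mean_mz = sum(p[0] for p in peaks) / len(peaks)
            mean_intensity = sum(p[1] for p in peaks) / len(peaks)
            consensus.append((mean_mz, mean_intensity))
    return consensus


# Made-up replicate spectra; the peak near m/z 300 occurs only once and is dropped.
replicates = [
    [(100.1, 10.0), (200.2, 50.0)],
    [(100.2, 20.0), (200.1, 40.0), (300.0, 5.0)],
    [(100.15, 30.0), (200.3, 60.0)],
]
consensus = consensus_spectrum(replicates)
```

A production spectral library builder would also handle mass tolerance in ppm, intensity normalization and replicate weighting, none of which are attempted here.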


Figure 1.
Overview of the SysteMHC Atlas project. (A) The SysteMHC Atlas aims to be a long-term data-driven project that serves the community. It is linked to other repositories of proteomic data and consists of two main components: (i) a uniform computational pipeline for processing raw MS files and (ii) a web interface with storage, searching and browsing capabilities. First, shotgun/DDA-MS experimental data generated for specific projects are submitted by the data producers to PRIDE. Raw MS data are then uploaded into the SysteMHC Atlas and processed through a consistent and open computational pipeline (B) that controls the quality of peptide identification and peptide annotation to specific HLA alleles. Spectral libraries are generated and can be converted into high-quality HLA allele-specific peptide assay libraries, also available at SWATHAtlas. All the results generated by the computational pipeline are made available to the public domain via the SysteMHC Atlas web-based interface, which provides links to the Immune Epitope Database (IEDB) for accessing lists of peptides originally identified and published by the data producers. (B) Current computational pipeline used for generating the immunopeptidome- and spectral database for different HLA allotypes. MS output files generated from several types of instruments are first converted into mzXML file format and then searched using several open-source database search engines. The resulting peptide identifications are combined and statistically scored using PeptideProphet and iProphet within the Trans-Proteomic Pipeline (TPP) (30,31). The identified peptides are next annotated to their respective HLA allele in a fully automated fashion using the stand-alone software package of NetMHCcons 1.1 (29). Spectral libraries are generated using SpectraST (32). Allele-specific peptide spectral libraries are generated from multiple samples—an example for HLA-A03 is highlighted in red. 
Each HLA peptide is labeled with a unique and permanent library identifier (LibID). Details regarding the computational pipeline and how the data were processed are available at the SysteMHC Atlas website in the ‘ABOUT’ section.
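The annotation step in panel B, in which each identified peptide is assigned to one of the sample's HLA alleles, can be sketched as follows. This is only an illustration of the assignment logic, assuming that a binding predictor has already produced a percentile rank per peptide and allele. It does not call NetMHCcons, and the function name, the 2% rank cutoff, the peptide labels and the rank values are hypothetical.

```python
from typing import Dict, Optional


def annotate_peptide(ranks: Dict[str, float], max_rank: float = 2.0) -> Optional[str]:
    """Return the allele with the lowest predicted percentile rank.

    Returns None when no allele reaches the (assumed) binder cutoff.
    """
    if not ranks:
        return None
    best_allele = min(ranks, key=ranks.get)
    return best_allele if ranks[best_allele] <= max_rank else None


# Made-up predictor output: peptide -> {allele: percentile rank}.
predicted = {
    "PEP_ONE": {"allele_1": 0.1, "allele_2": 35.0},
    "PEP_TWO": {"allele_1": 40.0, "allele_2": 55.0},
}

annotations = {pep: annotate_peptide(ranks) for pep, ranks in predicted.items()}
```

Peptides that map to no allele stay unannotated (`None`) rather than being forced onto the least-bad allele, which keeps allele-specific libraries free of likely non-binders.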
Figure 2.
Immunopeptidomics datasets used for building the first version of the SysteMHC Atlas. Data from 23 projects that collectively generated 1184 raw MS files constitute the initial contents of the SysteMHC Atlas. Each project is labeled with a unique SYSMHC identifier and linked to its corresponding PubMed, PRIDE and IEDB ID. For unpublished projects, IDs are not applicable (NA).
Figure 3.
Cumulative number of MS/MS spectra versus cumulative number of distinct peptides for HLA class I alleles at FDR 1%. (A) All HLA class I peptides were combined. (B) HLA class I alleles that were frequently found in various datasets. Eventually, the curves are expected to reach saturation when most observable peptides will have been cataloged at 1% peptide FDR.
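A curve of this kind can be computed from an ordered list of peptide-spectrum matches that have already passed the FDR threshold. The sketch below assumes the input is simply the peptide sequence assigned to each spectrum, in acquisition or processing order. The function name and the example sequences are placeholders, not Atlas data.

```python
from typing import Iterable, List, Tuple


def saturation_curve(psm_peptides: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (cumulative spectra, cumulative distinct peptides) pairs."""
    seen = set()
    curve: List[Tuple[int, int]] = []
    for n_spectra, peptide in enumerate(psm_peptides, start=1):
        seen.add(peptide)
        curve.append((n_spectra, len(seen)))
    return curve


# Placeholder sequences: repeats add spectra without adding distinct peptides.
curve = saturation_curve(["AAA", "BBB", "AAA", "CCC", "BBB", "AAA"])
```

The curve flattens wherever new spectra only re-identify peptides that are already in the set, which is the saturation behavior the caption describes.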
Figure 4.
Explore page in the SysteMHC Atlas web-based interface. HLA allele-specific peptide spectral libraries can be downloaded here. The web interface can also be used to query the SysteMHC Atlas and find specific information. (A) As an example, the source protein BIRC6 was searched and the Atlas returned all HLA-associated peptides originating from this protein as well as the context (i.e. SysteMHC ID, Sample ID, iProphet score, HLA annotation score, spectral counts, assigned HLA type and class) in which each peptide was observed. The user can then click on a specific Sample ID hyperlink and be redirected to the corresponding raw MS files and metadata (e.g. tissue type, cell type, culture condition, purification method, antibody used, mass spectrometer used, etc.). (B) The peptide RLLDYVATV was searched and the Atlas returned the datasets in which this peptide was observed. By clicking on the peptide sequence hyperlink, the user is redirected to a new page in which the LibID information is available for MS/MS spectra visualization. Information can be downloaded as .csv files for further analysis.
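Once results have been downloaded as .csv, a query by source protein can be repeated offline. The sketch below assumes a hypothetical column layout (`peptide`, `source_protein`, `sample_id`, `iprophet_score`). The actual column names in an Atlas export may differ, and the rows shown are invented placeholders rather than real Atlas records.

```python
import csv
import io
from typing import Dict, List

# Invented example export; column names and rows are assumptions for illustration.
EXPORT = """peptide,source_protein,sample_id,iprophet_score
PEP_ONE,BIRC6,sample_a,0.99
PEP_TWO,OTHER1,sample_a,0.97
PEP_THREE,BIRC6,sample_b,0.95
"""


def peptides_for_protein(
    csv_text: str, protein: str, min_score: float = 0.0
) -> List[Dict[str, str]]:
    """Return rows whose source protein matches and whose score is high enough."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if row["source_protein"] == protein
        and float(row["iprophet_score"]) >= min_score
    ]


hits = peptides_for_protein(EXPORT, "BIRC6")
strict_hits = peptides_for_protein(EXPORT, "BIRC6", min_score=0.98)
```

To run this against a real download, replace `io.StringIO(csv_text)` with an open file handle and adjust the column names to match the file's header row.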
Figure 5.
Data storage and visualization. To access information about specific datasets, the user selects a specific SYSMHC ID/Project name (e.g. SYSMHC00005) and clicks on ‘view dataset’ at the bottom left of the screen. The samples related to this project are then listed and linked to the number of replicates, organism, tissue and cell type of origin as well as the HLA typing information (upper panel). The user can then click on a specific Sample ID to visualize the metadata and to download the raw or converted mzXML MS files (red squares). A list of sample-specific HLA-associated peptides can be visualized at 1% peptide-level FDR (green squares). Sample-specific spectral libraries, including consensus fragment ion spectra, can be visualized and downloaded (orange and blue squares). Heat maps (black squares) are used to visualize the annotation of individual peptides to their respective HLA allele (dark blue peptides are predicted to be strong HLA binders according to NetMHCcons).


Cited by 23 articles

  • Landscape mapping of shared antigenic epitopes and their cognate TCRs of tumor-infiltrating T lymphocytes in melanoma.
    Murata K, Nakatsugawa M, Rahman MA, Nguyen LT, Millar DG, Mulder DT, Sugata K, Saijo H, Matsunaga Y, Kagoya Y, Guo T, Anczurowski M, Wang CH, Burt BD, Ly D, Saso K, Easson A, Goldstein DP, Reedijk M, Ghazarian D, Pugh TJ, Butler MO, Mak TW, Ohashi PS, Hirano N. Elife. 2020 Apr 21;9:e53244. doi: 10.7554/eLife.53244. PMID: 32314731 Free PMC article.
  • Specificity of bispecific T cell receptors and antibodies targeting peptide-HLA.
    Holland CJ, Crean RM, Pentier JM, de Wet B, Lloyd A, Srikannathasan V, Lissin N, Lloyd KA, Blicher TH, Conroy PJ, Hock M, Pengelly RJ, Spinner TE, Cameron B, Potter EA, Jeyanthan A, Molloy PE, Sami M, Aleksic M, Liddy N, Robinson RA, Harper S, Lepore M, Pudney CR, van der Kamp MW, Rizkallah PJ, Jakobsen BK, Vuidepot A, Cole DK. J Clin Invest. 2020 May 1;130(5):2673-2688. doi: 10.1172/JCI130562. PMID: 32310221
  • Murine xenograft bioreactors for human immunopeptidome discovery.
    Heather JM, Myers PT, Shi F, Aziz-Zanjani MO, Mahoney KE, Perez M, Morin B, Brittsan C, Shabanowitz J, Hunt DF, Cobbold M. Sci Rep. 2019 Dec 6;9(1):18558. doi: 10.1038/s41598-019-54700-2. PMID: 31811195 Free PMC article.
  • The Human Immunopeptidome Project: A Roadmap to Predict and Treat Immune Diseases.
    Vizcaíno JA, Kubiniok P, Kovalchik KA, Ma Q, Duquette JD, Mongrain I, Deutsch EW, Peters B, Sette A, Sirois I, Caron E. Mol Cell Proteomics. 2020 Jan;19(1):31-49. doi: 10.1074/mcp.R119.001743. Epub 2019 Nov 19. PMID: 31744855
  • The ProteomeXchange consortium in 2020: enabling 'big data' approaches in proteomics.
    Deutsch EW, Bandeira N, Sharma V, Perez-Riverol Y, Carver JJ, Kundu DJ, García-Seisdedos D, Jarnuczak AF, Hewapathirana S, Pullman BS, Wertz J, Sun Z, Kawano S, Okuda S, Watanabe Y, Hermjakob H, MacLean B, MacCoss MJ, Zhu Y, Ishihama Y, Vizcaíno JA. Nucleic Acids Res. 2020 Jan 8;48(D1):D1145-D1152. doi: 10.1093/nar/gkz984. PMID: 31686107 Free PMC article.


    1. Istrail S., Florea L., Halldórsson B.V., Kohlbacher O., Schwartz R.S., Yap V.B., Yewdell J.W., Hoffman S.L. Comparative immunopeptidomics of humans and their pathogens. Proc. Natl. Acad. Sci. U.S.A. 2004; 101:13268–13272.
    2. Caron E., Vincent K., Fortier M.-H., Laverdure J.-P., Bramoullé A., Hardy M.-P., Voisin G., Roux P.P., Lemieux S., Thibault P. et al. The MHC I immunopeptidome conveys to the cell surface an integrative view of cellular regulation. Mol. Syst. Biol. 2011; 7:533–533.
    3. Caron E., Kowalewski D.J., Koh C.C., Sturm T., Schuster H., Aebersold R. Analysis of major histocompatibility complex (MHC) immunopeptidomes using mass spectrometry. Mol. Cell. Proteomics. 2015; 14:3105–3117.
    4. Neefjes J., Jongsma M.L.M., Paul P., Bakke O. Towards a systems understanding of MHC class I and MHC class II antigen presentation. Nat. Rev. Immunol. 2011; 11:823–836.
    5. Rock K.L., Reits E., Neefjes J. Present yourself! by MHC class I and MHC class II molecules. Trends Immunol. 2016; 37:724–737.
