PLoS Comput Biol. 2008 Jul 4;4(7):e1000107.
doi: 10.1371/journal.pcbi.1000107.

Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan

Morten Nielsen et al. PLoS Comput Biol.

Abstract

CD4 positive T helper cells control many aspects of specific immunity. These cells are specific for peptides derived from protein antigens and presented by molecules of the extremely polymorphic major histocompatibility complex (MHC) class II system. The identification of peptides that bind to MHC class II molecules is therefore of pivotal importance for rational discovery of immune epitopes. HLA-DR is a prominent example of a human MHC class II. Here, we present a method, NetMHCIIpan, that allows for pan-specific predictions of peptide binding to any HLA-DR molecule of known sequence. The method is derived from a large compilation of quantitative HLA-DR binding events covering 14 of the more than 500 known HLA-DR alleles. Taking both peptide and HLA sequence information into account, the method can generalize and also predict peptide binding for HLA-DR molecules where experimental data is absent. Validation of the method includes identification of endogenously derived HLA class II ligands, cross-validation, leave-one-molecule-out, and binding motif identification for hitherto uncharacterized HLA-DR molecules. The validation shows that the method can successfully predict binding for HLA-DR molecules, even in the absence of specific data for the particular molecule in question. Moreover, when compared to TEPITOPE, currently the only other publicly available prediction method aiming at providing broad HLA-DR allelic coverage, NetMHCIIpan performs equivalently for alleles included in the training of TEPITOPE while outperforming TEPITOPE on novel alleles. We propose that the method can be used to identify those hitherto uncharacterized alleles that should be addressed experimentally in future updates of the method to cover the polymorphism of HLA-DR most efficiently. We thus conclude that the presented method meets the challenge of keeping up with the MHC polymorphism discovery rate and that it can be used to sample the MHC "space," enabling a highly efficient iterative process for improving MHC class II binding predictions.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Schematic Illustration of the NetMHCIIpan Method.
(A) The HLA-DR pseudo sequence is constructed from polymorphic HLA-DR residues in potential contact with a bound peptide. (B) A position specific scoring matrix (PSSM) and a peptide core alignment (shown in red) are made for each allele using the SMM-align method. The N and C terminal peptide flanking regions (PFR) are identified as the up to three amino acids flanking the peptide-binding core. (C) Suboptimal peptides are presented to the NetMHCpan method with binding values normalized to the optimal peptide score (for the peptide shown in red) as described in Materials and Methods. (D) The NetMHCIIpan method is trained integrating data from all alleles. Input to the artificial neural network training includes the peptide core, the composition and length of the N and C terminal PFR, the length of the source peptide, and the normalized binding affinity value (for details see Materials and Methods).
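Panel (B) can be illustrated with a minimal sketch: slide a window over the peptide, score each window against a position specific scoring matrix, take the best window as the binding core, and read off up to three flanking residues on each side. This is not the SMM-align implementation. The nine-residue core length, the function names, the toy matrix values, and the example peptide are all assumptions made for illustration.

```python
CORE_LEN = 9   # assumed core length; the caption does not state it
MAX_PFR = 3    # "up to three amino acids" flanking the core

def score_core(core, pssm):
    """Sum the per-position scores; residues missing from the matrix score 0."""
    return sum(pssm[i].get(aa, 0.0) for i, aa in enumerate(core))

def best_core(peptide, pssm):
    """Return (start, core, score) for the highest-scoring window.

    Ties are broken in favour of the leftmost window.
    """
    if len(peptide) < CORE_LEN:
        raise ValueError("peptide is shorter than the core length")
    candidates = []
    for start in range(len(peptide) - CORE_LEN + 1):
        core = peptide[start:start + CORE_LEN]
        candidates.append((score_core(core, pssm), start, core))
    score, start, core = max(candidates, key=lambda c: (c[0], -c[1]))
    return start, core, score

def flanks(peptide, start):
    """Return the N and C terminal flanking regions (at most MAX_PFR residues each)."""
    n_pfr = peptide[max(0, start - MAX_PFR):start]
    c_pfr = peptide[start + CORE_LEN:start + CORE_LEN + MAX_PFR]
    return n_pfr, c_pfr

# Hypothetical toy matrix: one dict of residue scores per core position.
TOY_PSSM = [dict() for _ in range(CORE_LEN)]
TOY_PSSM[0] = {"F": 2.0, "Y": 1.5, "L": 1.0}
TOY_PSSM[5] = {"A": 1.0, "S": 0.5}
TOY_PSSM[8] = {"L": 1.0}

peptide = "PKYVKQNTLKLAT"  # example input only
start, core, score = best_core(peptide, TOY_PSSM)
n_pfr, c_pfr = flanks(peptide, start)
print(start, core, score, n_pfr, c_pfr)
```

With these toy scores the chosen core is `YVKQNTLKL`, leaving `PK` and `AT` as the flanking regions. A flank is shorter than three residues whenever the core sits near a peptide terminus.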
Figure 2. Predictive Performance in Terms of the Pearson's Correlation of the LOO Pan-Specific Method as a Function of the Distance to Its Nearest Neighbor HLA-DR Allele.
The nearest neighbor distance is estimated as described in Materials and Methods.
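The leave-one-out (LOO) setup named in the caption can be sketched as a data split in which every record for one allele is withheld from training and used only for testing. The record layout `(allele, peptide, affinity)`, the function name, and the toy values below are hypothetical; how the paper's pipeline actually stores and partitions its data is not described here.

```python
from collections import defaultdict

def leave_one_molecule_out(records):
    """Yield (held_out_allele, train, test), holding each allele out once.

    records: iterable of (allele, peptide, affinity) tuples (assumed layout).
    """
    by_allele = defaultdict(list)
    for rec in records:
        by_allele[rec[0]].append(rec)
    for held_out in sorted(by_allele):
        train = [r for allele, recs in by_allele.items()
                 if allele != held_out for r in recs]
        yield held_out, train, by_allele[held_out]

# Toy records with made-up affinities, for illustration only.
toy = [
    ("DRB1*0101", "PKYVKQNTLKLAT", 0.8),
    ("DRB1*0101", "AAAAAAAAAAAAA", 0.1),
    ("DRB1*0301", "PKYVKQNTLKLAT", 0.3),
    ("DRB1*0401", "PKYVKQNTLKLAT", 0.6),
]

for allele, train, test in leave_one_molecule_out(toy):
    print(allele, len(train), len(test))
```

Because the held-out allele contributes no training data, the score on its test set measures how well a model generalizes to a molecule it has never seen.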
Figure 3. Cross-Validation Benchmark Evaluation.
The predictive performance of the pan-specific, SMM-align, and TEPITOPE methods is compared in terms of Pearson's correlation and AUC, each averaged over the 11 alleles covered by the TEPITOPE method (data for the individual alleles is given in Table S2).
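For readers unfamiliar with the two measures, the sketch below computes Pearson's correlation between predicted and measured values, and the AUC as the fraction of binder/non-binder pairs that the prediction ranks correctly (ties count one half). The function names and toy numbers are illustrative assumptions. How the paper labels peptides as binders is not stated in this caption, so the labels are taken as given.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy values, for illustration only.
print(pearson([0.1, 0.4, 0.9], [0.2, 0.5, 0.7]))
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 3 of 4 pairs ranked correctly
```

The pairwise form of the AUC is quadratic in the number of peptides. That is fine for a sketch, though a rank-based computation would be preferable for large benchmarks.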
Figure 4. Prediction of Endogenously Presented Peptides.
The benchmark data set consists of 584 HLA-DR restricted ligands covering 28 HLA-DR alleles downloaded from the SYFPEITHI database as described in the text. For alleles not covered by the TEPITOPE method, the closest allele covered by the TEPITOPE method as identified by sequence similarity between the HLA pseudo-sequences is used. TEPITOPE Alleles give the average AUC performance over the 17 alleles covered by the TEPITOPE method, and non-TEPITOPE Alleles give the average AUC performance over the 11 alleles not covered by the TEPITOPE method (data for the individual alleles is given in Table S3).
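The fallback described above, where an allele outside TEPITOPE's coverage is replaced by the closest covered allele, can be sketched as a nearest-neighbor lookup over pseudo-sequences. The caption does not specify the similarity measure, so the fraction of identical positions used here is an assumption, and the allele names and pseudo-sequences are made-up placeholders rather than real HLA-DR data.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length pseudo-sequences."""
    if len(a) != len(b):
        raise ValueError("pseudo-sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def closest_covered(query_pseudo, covered):
    """Return the covered allele whose pseudo-sequence is most similar to the query.

    covered: dict mapping allele name -> pseudo-sequence (assumed layout).
    Ties are broken alphabetically by allele name.
    """
    if not covered:
        raise ValueError("no covered alleles given")
    return max(sorted(covered),
               key=lambda allele: identity(query_pseudo, covered[allele]))

# Hypothetical placeholders, not real pseudo-sequences.
covered = {
    "allele_A": "QEFFIASGAA",
    "allele_B": "QEYFIVSGKA",
}
print(closest_covered("QEYFIASGKA", covered))
```

Here the query matches `allele_B` at nine of ten positions and `allele_A` at eight, so `allele_B` is selected as the stand-in.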
Figure 5. HLA-DR Clustering from NetMHCIIpan Predictions.
The figure shows the clustering for 76 representative HLA-DR alleles. The tree was generated using the neighbor-joining algorithm from HLA distance matrices as described in the text. The circles are guides to the eye highlighting the suggested 12 HLA-DR supertypes.
Figure 6. Strategy for Effective and Rational Coverage of the MHC Polymorphism and Specificity.
(A) The pan-specific MHC class II prediction method is used to identify MHC alleles with novel binding specificities. These alleles have a predicted binding motif that is distant to all MHC class II molecules previously described. Subsequently, immunoassays are developed describing their binding specificity and data is fed back into a retraining of the pan-specific method. (B) Next, peptides with uncharacterized binding affinity (high information peptides) are identified, experimentally assayed, and fed back into the retraining.
