Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan

PLoS Comput Biol. 2008 Jul 4;4(7):e1000107. doi: 10.1371/journal.pcbi.1000107.

Abstract

CD4-positive T helper cells control many aspects of specific immunity. These cells are specific for peptides derived from protein antigens and presented by molecules of the extremely polymorphic major histocompatibility complex (MHC) class II system. The identification of peptides that bind to MHC class II molecules is therefore of pivotal importance for rational discovery of immune epitopes. HLA-DR is a prominent example of a human MHC class II. Here, we present a method, NetMHCIIpan, that allows for pan-specific predictions of peptide binding to any HLA-DR molecule of known sequence. The method is derived from a large compilation of quantitative HLA-DR binding events covering 14 of the more than 500 known HLA-DR alleles. Taking both peptide and HLA sequence information into account, the method can generalize and predict peptide binding also for HLA-DR molecules where experimental data is absent. Validation of the method includes identification of endogenously derived HLA class II ligands, cross-validation, leave-one-molecule-out, and binding motif identification for hitherto uncharacterized HLA-DR molecules. The validation shows that the method can successfully predict binding for HLA-DR molecules, even in the absence of specific data for the particular molecule in question. Moreover, when compared to TEPITOPE, currently the only other publicly available prediction method aiming at providing broad HLA-DR allelic coverage, NetMHCIIpan performs equivalently for alleles included in the training of TEPITOPE while outperforming TEPITOPE on novel alleles. We propose that the method can be used to identify those hitherto uncharacterized alleles that should be addressed experimentally in future updates of the method to cover the polymorphism of HLA-DR most efficiently. We thus conclude that the presented method meets the challenge of keeping up with the MHC polymorphism discovery rate and that it can be used to sample the MHC "space," enabling a highly efficient iterative process for improving MHC class II binding predictions.
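
The leave-one-molecule-out validation named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the record layout (allele, peptide, affinity), and the toy data are all assumptions made for illustration. The idea shown is only the data split, in which all measurements for one HLA-DR molecule are withheld so that a model trained on the remaining molecules is tested on a molecule it has never seen.

```python
from collections import defaultdict


def leave_one_molecule_out(records):
    """Yield (held_out_allele, train, test) splits.

    `records` is an iterable of (allele, peptide, affinity) tuples.
    This layout is a hypothetical choice for the sketch, not the
    format used by the published method.
    """
    by_allele = defaultdict(list)
    for rec in records:
        by_allele[rec[0]].append(rec)

    for held_out in sorted(by_allele):
        # Test set: every measurement for the held-out molecule.
        test = by_allele[held_out]
        # Training set: measurements for all other molecules.
        train = [
            rec
            for allele, recs in by_allele.items()
            if allele != held_out
            for rec in recs
        ]
        yield held_out, train, test


# Placeholder data: allele names, peptides, and values are invented.
toy = [
    ("allele_A", "AAAAAAAAAAAAAAA", 0.1),
    ("allele_A", "CCCCCCCCCCCCCCC", 0.7),
    ("allele_B", "DDDDDDDDDDDDDDD", 0.4),
    ("allele_C", "EEEEEEEEEEEEEEE", 0.9),
]

for held_out, train, test in leave_one_molecule_out(toy):
    print(held_out, len(train), len(test))
```

In such a scheme, the predictor would be retrained once per held-out molecule, and performance on the test split estimates how well predictions transfer to molecules without binding data.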

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Alleles
  • Amino Acid Sequence / physiology
  • Binding Sites / genetics
  • Binding Sites / immunology
  • Databases, Protein
  • HLA-DR Antigens / genetics
  • HLA-DR Antigens / immunology
  • HLA-DR Antigens / metabolism*
  • Humans
  • Major Histocompatibility Complex / genetics
  • Molecular Sequence Data
  • Predictive Value of Tests
  • Protein Binding / immunology
  • Protein Interaction Mapping / methods*
  • Reproducibility of Results
  • Sequence Alignment
  • Sequence Analysis, Protein

Substances

  • HLA-DR Antigens