Epitope discovery with phylogenetic hidden Markov models

Mol Biol Evol. 2010 May;27(5):1212-20. doi: 10.1093/molbev/msq008. Epub 2010 Jan 20.


Existing methods for the prediction of immunologically active T-cell epitopes are based on the amino acid sequence or structure of pathogen proteins. Additional information regarding the locations of epitopes may be acquired by considering the evolution of viruses in hosts with different immune backgrounds. In particular, immune-dependent evolutionary patterns at sites within or near T-cell epitopes can be used to enhance epitope identification. We have developed a mutation-selection model of T-cell epitope evolution that allows the human leukocyte antigen (HLA) genotype of the host to influence the evolutionary process. This is one of the first examples of the incorporation of environmental parameters into a phylogenetic model and has many other potential applications where the selection pressures exerted on an organism can be related directly to environmental factors. We combine this novel evolutionary model with a hidden Markov model to identify contiguous amino acid positions that appear to evolve under immune pressure in the presence of specific host immune alleles and that therefore represent potential epitopes. This phylogenetic hidden Markov model provides a rigorous probabilistic framework that can be combined with sequence or structural information to improve epitope prediction. As a demonstration, we apply the model to a data set of HIV-1 protein-coding sequences and host HLA genotypes.
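The hidden Markov layer described above can be illustrated with a minimal two-state sketch. It is not the authors' implementation. It assumes that a per-site log-likelihood has already been computed under a "background" model and under an "epitope" (immune-pressure) model; in the paper these values would come from the HLA-dependent mutation-selection model evaluated on the phylogeny, whereas here they are invented numbers. The function name `viterbi_two_state`, the `stay_prob` parameter, and the uniform start probabilities are all hypothetical choices made for illustration.

```python
import math

def viterbi_two_state(site_loglik, stay_prob=0.9, start_prob=(0.5, 0.5)):
    """Most probable state path for a two-state HMM (0 = background, 1 = epitope).

    site_loglik: list of (loglik_background, loglik_epitope) tuples, one per site.
    stay_prob:   assumed probability of remaining in the same state at the next site.
    """
    n_states = 2
    # Log transition matrix: staying is likely, switching is penalized,
    # which is what favors contiguous runs of epitope sites.
    log_trans = [
        [math.log(stay_prob if i == j else 1.0 - stay_prob) for j in range(n_states)]
        for i in range(n_states)
    ]

    # Initialization at the first site.
    score = [math.log(start_prob[s]) + site_loglik[0][s] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at site t + 1

    # Recursion over the remaining sites.
    for ll in site_loglik[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + ll[s])
        score = new_score
        back.append(pointers)

    # Traceback from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Invented log-likelihoods: three central sites fit the epitope model better.
example = [(-1, -5)] * 2 + [(-5, -1)] * 3 + [(-1, -5)] * 2
print(viterbi_two_state(example))  # [0, 0, 1, 1, 1, 0, 0]
```

Because switching states carries a cost, a single site with only weak support for the epitope model is not called on its own; a run of consecutive supporting sites is needed. This is the sense in which the hidden Markov layer identifies contiguous positions rather than isolated ones.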

MeSH terms

  • Alleles
  • Bayes Theorem
  • Epitopes, T-Lymphocyte / genetics*
  • Epitopes, T-Lymphocyte / immunology*
  • HIV Core Protein p24 / immunology
  • HLA-B Antigens / immunology
  • Humans
  • Markov Chains*
  • Models, Genetic
  • Models, Immunological
  • Phylogeny*
  • Probability


Substances

  • Epitopes, T-Lymphocyte
  • HIV Core Protein p24
  • HLA-B Antigens
  • HLA-B57 antigen