AllerTOP v.2--a server for in silico prediction of allergens

J Mol Model. 2014 Jun;20(6):2278. doi: 10.1007/s00894-014-2278-5. Epub 2014 May 31.


Allergy is an overreaction by the immune system to a previously encountered, ordinarily harmless substance, typically a protein, resulting in skin rash, swelling of mucous membranes, sneezing or wheezing, or other abnormal conditions. The use of modified proteins is increasingly widespread: their presence in food, in commercial products such as washing powder, and in medical therapeutics and diagnostics makes predicting and identifying potential allergens a crucial societal issue. The prediction of allergens has been explored widely using bioinformatics, with many tools developed in the last decade; many of these are freely available online. Here, we report a set of novel models for allergen prediction utilizing amino acid E-descriptors, auto- and cross-covariance (ACC) transformation, and several machine learning methods for classification, including logistic regression (LR), decision tree (DT), naïve Bayes (NB), random forest (RF), multilayer perceptron (MLP) and k-nearest neighbours (kNN). The best-performing method was kNN, with 85.3% accuracy in 5-fold cross-validation. The resulting model has been implemented in a revised version of the AllerTOP server (
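As a rough illustration of the auto- and cross-covariance step named in the abstract, the sketch below turns protein sequences of different lengths into feature vectors of equal length. It is not the server's implementation. It assumes one common form of the transformation, ACC_jk(lag) = sum over i of E_j(i) * E_k(i + lag), divided by (n - lag). The five descriptors per residue and the maximum lag of 5 are assumptions, since the abstract does not state them. The descriptor table holds seeded random placeholders, not the published E-descriptor values, and the names `PLACEHOLDER_E` and `acc_transform` are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_DESCRIPTORS = 5  # assumption: five descriptors per residue
MAX_LAG = 5        # assumption: the abstract does not state the lag

# PLACEHOLDER values only: seeded random numbers standing in for the
# published E-descriptor table, which is not reproduced here.
_rng = random.Random(0)
PLACEHOLDER_E = {
    aa: [_rng.uniform(-1.0, 1.0) for _ in range(N_DESCRIPTORS)]
    for aa in AMINO_ACIDS
}


def acc_transform(sequence, table=PLACEHOLDER_E, max_lag=MAX_LAG):
    """Map a protein sequence to a fixed-length list of ACC terms.

    For each lag and each ordered descriptor pair (j, k), average the
    product of descriptor j at position i and descriptor k at i + lag.
    """
    rows = [table[aa] for aa in sequence]  # KeyError on unknown residues
    n = len(rows)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_desc = len(rows[0])
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_desc):
            for k in range(n_desc):
                total = sum(
                    rows[i][j] * rows[i + lag][k] for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


if __name__ == "__main__":
    short = acc_transform("ACDEFGHIKL")
    longer = acc_transform("ACDEFGHIKLMNPQRSTVWY")
    # Both vectors have n_desc * n_desc * max_lag entries.
    print(len(short), len(longer))
```

Because every sequence maps to a vector of the same length, the output could be passed to any of the classifiers listed in the abstract, for example a k-nearest-neighbours model evaluated with 5-fold cross-validation.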

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Allergens / adverse effects*
  • Allergens / chemistry*
  • Allergens / immunology
  • Amino Acid Sequence
  • Artificial Intelligence
  • Bayes Theorem
  • Computational Biology / methods*
  • Databases, Protein*
  • Decision Support Techniques*
  • Decision Trees
  • Humans
  • Hypersensitivity / etiology*
  • Hypersensitivity / immunology
  • Logistic Models
  • Proteins / adverse effects*
  • Proteins / chemistry*
  • Proteins / immunology
  • Reproducibility of Results
  • Risk Assessment
  • Risk Factors
  • Structure-Activity Relationship


Substances

  • Allergens
  • Proteins