Regulatory agencies, biotech companies and physicians urgently need quantitative guidelines to distinguish allergenic proteins from related but non-allergenic ones. In a previous study, we found that allergenic proteins populate a relatively small number of protein families, as characterized by the Pfam database. However, these families also contain non-allergenic proteins, meaning that allergenic determinants must lie within more discrete regions of the sequence. Thus, new methods are needed to discriminate allergenic proteins within those families. Physical-Chemical Properties (PCP)-motifs specific for allergens within a Pfam class were determined for 17 highly populated protein domains. A novel scoring method based on PCP-motifs that characterize known allergenic proteins within these families was developed and validated for those domains. The motif scores distinguished sequences of allergens from 80,000 randomly selected non-allergenic sequences. The motif scores for the birch pollen allergen (Bet v 1) family, which also contains related fruit and nut allergens, correlated better than global sequence similarities with clinically observed cross-reactivities among those allergens. Further, we demonstrated that the average scores of allergen-specific motifs for allergenic profilins differ significantly from those of non-allergenic profilins. Several of the selective motifs coincide with experimentally determined IgE epitopes of allergenic profilins. The motifs also discriminated allergenic pectate lyases, including Jun a 1 from mountain cedar pollen, from similar proteins in the human microbiome, which can be assumed to be non-allergens. The latter lacked key motifs characteristic of the known allergens, some of which correlate with known IgE-binding sites.
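The general idea of scoring a sequence against a property-based motif can be illustrated with a minimal sketch. It is not the scoring method developed in this work: it assumes a single property scale (Kyte-Doolittle hydropathy) as a stand-in for the PCP descriptors, and the function names, the spread floor of 0.5 and the toy segments are hypothetical choices made only for illustration. A motif is represented as a per-position mean and spread of the property over aligned allergen segments, and a query sequence is scored by its best-matching window.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy values, used here as a stand-in property
# scale (an assumption; the PCP descriptors of the study are not given).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def build_profile(segments):
    """Per-position (mean, spread) of the property over equal-length segments."""
    length = len(segments[0])
    if any(len(s) != length for s in segments):
        raise ValueError("segments must have equal length")
    profile = []
    for i in range(length):
        values = [HYDROPATHY[s[i]] for s in segments]
        # Floor the spread (hypothetical value) so that fully conserved
        # positions do not cause a division by zero.
        profile.append((mean(values), max(pstdev(values), 0.5)))
    return profile


def window_score(window, profile):
    """Negative mean squared z-deviation: 0 is a perfect match, lower is worse."""
    deviations = [
        ((HYDROPATHY[aa] - mu) / sd) ** 2
        for aa, (mu, sd) in zip(window, profile)
    ]
    return -mean(deviations)


def best_motif_score(sequence, profile):
    """Score of the best-matching window, or None if the sequence is too short."""
    width = len(profile)
    scores = [
        window_score(sequence[i:i + width], profile)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scores) if scores else None


# Toy example: made-up segments, not taken from any real allergen family.
toy_profile = build_profile(["LIVKD", "LLVKE", "IIVRD"])
score_with_motif = best_motif_score("GGSLIVKDGG", toy_profile)
score_without_motif = best_motif_score("GGSDKEDLGG", toy_profile)
```

In this toy setting the sequence containing a motif-like window receives the higher (less negative) score. A realistic application would need multiple property descriptors, motifs derived from curated alignments, and a calibrated threshold.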
Keywords: Allergens; Birch pollen allergen; Human microbiome; Pectate lyase; Physical-chemical property motifs; Profilin.
Copyright © 2018 Elsevier Ltd. All rights reserved.