MAAP: malarial adhesins and adhesin-like proteins predictor

Proteins. 2008 Feb 15;70(3):659-66. doi: 10.1002/prot.21568.

Abstract

Malaria caused by protozoan parasites belonging to the genus Plasmodium is a dreaded disease, second only to tuberculosis. The emergence of parasites resistant to commonly used drugs and the lack of available vaccines aggravate the problem. One of the preventive approaches targets adhesion of parasites to host cells and tissues. Adhesion of parasites is mediated by proteins called adhesins. Abrogation of adhesion by either immunizing the host with adhesins or inhibiting the interaction using structural analogs of host cell receptors holds the potential to develop novel preventive strategies. The availability of complete genome sequences offers new opportunities for identifying adhesins and adhesin-like proteins. Development of computational algorithms can simplify this task and accelerate experimental characterization of the predicted adhesins from complete genomes. A curated positive dataset of experimentally known adhesins from Plasmodium species was prepared by careful examination of literature reports. "Controversial" or "hypothetical" adhesins were excluded. The negative dataset consisted of proteins representing various intracellular functions including information processing, metabolism, and interface (transporters). Neither training set included proteins that are likely to be on the surface but have unknown adhesin properties, or proteins that are linked even indirectly to the adhesion process. A nonhomology-based approach using 420 compositional properties of amino acid dipeptide and multiplet frequencies was used to develop the MAAP Web server, with a Support Vector Machine (SVM) model classifier as its engine, for the prediction of malarial adhesins and adhesin-like proteins. The MAAP engine has six SVM classifier models identified through an exhaustive search from a set of 728 kernel parameters. These models displayed an efficiency (Matthews correlation coefficient) of 0.860-0.967.
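The feature construction and the efficiency measure described above can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: that the 420 properties split into 400 dipeptide frequencies plus 20 multiplet frequencies, and that a "multiplet" means a run of two or more identical residues. The function names are hypothetical and this is not the MAAP implementation.

```python
import math
from itertools import groupby, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_frequencies(seq):
    """Fraction of adjacent residue pairs matching each of the 400 dipeptides."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for a, b in zip(seq, seq[1:]):
        if a + b in counts:  # skip pairs with non-standard residues
            counts[a + b] += 1
    total = max(len(seq) - 1, 1)
    return [counts[d] / total for d in DIPEPTIDES]


def multiplet_frequencies(seq):
    """Assumed definition: fraction of the sequence lying in runs (length >= 2)
    of each residue type."""
    in_runs = dict.fromkeys(AMINO_ACIDS, 0)
    for residue, run in groupby(seq):
        length = len(list(run))
        if length >= 2 and residue in in_runs:
            in_runs[residue] += length
    total = max(len(seq), 1)
    return [in_runs[a] / total for a in AMINO_ACIDS]


def feature_vector(seq):
    """420 values: 400 dipeptide + 20 multiplet frequencies (assumed split)."""
    seq = seq.upper()
    return dipeptide_frequencies(seq) + multiplet_frequencies(seq)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

A vector built this way could be passed to any SVM library; the kernel and parameters actually selected for MAAP are not given in the abstract.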
The final prediction P(maap) score is the maximum score attained by a given sequence in any of the six models. MAAP runs on complete proteomes of Plasmodium species showed that, in Plasmodium falciparum, P(maap) scores above 0.0 gave a sensitivity of 100% with two false positives. In P. vivax and P. yoelii, a threshold P(maap) score of 0.7 was optimal, with very few false positives (up to 5). Several new predictions were obtained. This list includes hypothetical protein PF14_0040, interspersed repeat antigen, STEVOR, liver stage antigen, SURFIN, RIFIN, stevor (3D7-stevorT3-2), mature parasite-infected erythrocyte surface antigen or P. falciparum erythrocyte membrane protein 2, merozoite surface protein 6 in P. falciparum, circumsporozoite proteins, microneme protein-1, Vir18, Vir12-like, Vir12, Vir18-like, Vir18-related and Vir4 in P. vivax, circumsporozoite protein/thrombospondin related anonymous proteins, 28 kDa ookinete surface protein, yir1, and yir4 of P. yoelii. A few of the proteins identified by MAAP matched those identified by other groups using different experimental and theoretical strategies. Most other interspersed repeat proteins in Plasmodium species had lower P(maap) scores. These new predictions could serve as new leads for further experimental characterization (MAAP webserver: http://maap.igib.res.in).
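The scoring rule stated above (the maximum over the six models, followed by a species-specific threshold) can be sketched as follows. The function names, protein identifiers, and score values are hypothetical and purely illustrative. The strict "greater than" comparison follows the phrase "scores above 0.0"; whether the 0.7 threshold is inclusive is not stated in the abstract.

```python
def p_maap(model_scores):
    """Final score: the maximum score a sequence attains across the models."""
    if not model_scores:
        raise ValueError("at least one model score is required")
    return max(model_scores)


def predict_adhesins(scores_by_protein, threshold):
    """Return {protein_id: P(maap)} for proteins scoring above the threshold."""
    predictions = {}
    for protein_id, model_scores in scores_by_protein.items():
        score = p_maap(model_scores)
        if score > threshold:  # strict comparison is an assumption
            predictions[protein_id] = score
    return predictions


# Made-up scores from six hypothetical models for two hypothetical proteins.
example_scores = {
    "protein_A": [-1.2, 0.3, 0.9, -0.1, 0.5, 0.2],
    "protein_B": [-0.8, -0.4, -1.1, -0.2, -0.6, -0.9],
}
falciparum_like = predict_adhesins(example_scores, threshold=0.0)
vivax_like = predict_adhesins(example_scores, threshold=0.7)
```

Taking the maximum means a sequence is flagged if any single model scores it above the threshold, which favors sensitivity over specificity.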

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adhesins, Bacterial / chemistry
  • Algorithms
  • Animals
  • Antigens, Surface / chemistry
  • Antigens, Surface / metabolism
  • Models, Theoretical
  • Plasmodium / pathogenicity*
  • Plasmodium falciparum / metabolism
  • Plasmodium vivax / metabolism
  • Plasmodium yoelii / metabolism
  • Protozoan Proteins / chemistry*
  • Protozoan Proteins / metabolism
  • Software*

Substances

  • Adhesins, Bacterial
  • Antigens, Surface
  • Protozoan Proteins