Novel families of toxin-like peptides in insects and mammals: a computational approach

J Mol Biol. 2007 Jun 1;369(2):553-66. doi: 10.1016/j.jmb.2007.02.106. Epub 2007 Mar 15.


Most animal toxins are short proteins that appear in venom and vary in sequence, structure and function. A common characteristic of many such toxins is their apparent structural stability. Sporadic instances of endogenous toxin-like proteins that function in non-venom contexts have been reported. We used a machine learning methodology, based on sequence-derived features and guided by the notion of structural stability, to conduct a large-scale search for toxins and toxin-like proteins. Application of the method to insect and mammalian sequences revealed novel families of toxin-like proteins. One of these proteins shows significant similarity to ion channel inhibitors that are expressed in cone snail and assassin bug venom and, surprisingly, is expressed in the bee brain. In a toxicity assay, injecting the protein into fish induced a strong yet reversible paralytic effect. We suggest that the protein may function as an endogenous modulator of voltage-gated Ca²⁺ channels. Additionally, we have identified a novel mammalian cluster of toxin-like proteins that are expressed in the testis. We suggest that these proteins might be involved in the regulation of nicotinic acetylcholine receptors that affect the acrosome reaction and sperm motility. Finally, we highlight a possible evolutionary link between venom toxins and antibacterial proteins. We expect our methodology to enhance the discovery of additional novel protein families.
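The abstract does not list the sequence-derived features or the classifier that were used. As a rough illustration only, the sketch below computes a few features that could plausibly serve such a search: peptide length, cysteine fraction (a crude proxy for disulfide-mediated stability), and amino acid composition. The function names, the feature set, and the cutoff values are all assumptions made for this example, and the simple threshold rule stands in for a trained machine learning model; it is not the authors' method.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_features(seq: str) -> dict:
    """Compute simple sequence-derived features for one peptide.

    The feature set is an assumption for illustration; the abstract
    does not specify which features were used.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    n = len(seq)
    features = {
        "length": n,
        # Cysteine-rich peptides often form disulfide bonds,
        # which is one common source of structural stability.
        "cys_fraction": counts["C"] / n,
    }
    for aa in AMINO_ACIDS:
        features[f"frac_{aa}"] = counts[aa] / n
    return features


def looks_toxin_like(seq: str, max_length: int = 100,
                     min_cys_fraction: float = 0.1) -> bool:
    """Hypothetical pre-filter: short and cysteine-rich.

    Both thresholds are arbitrary placeholders, not values from the paper.
    """
    f = sequence_features(seq)
    return f["length"] <= max_length and f["cys_fraction"] >= min_cys_fraction


if __name__ == "__main__":
    # Made-up peptide, not a real toxin sequence.
    example = "GCKSAGCPTKCRNGAC"
    print(sequence_features(example)["cys_fraction"])
    print(looks_toxin_like(example))
```

In practice, feature vectors of this kind would be passed to a classifier trained on known toxins and non-toxins rather than to fixed cutoffs.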

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Antimicrobial Cationic Peptides / chemistry
  • Antimicrobial Cationic Peptides / genetics
  • Apamin / chemistry
  • Apamin / genetics
  • Base Sequence
  • Bees
  • Computer Simulation*
  • Humans
  • Insect Proteins / chemistry
  • Insect Proteins / genetics
  • Insecta
  • Mice
  • Molecular Sequence Data
  • Neuropeptides / chemistry
  • Peptides / chemistry
  • Peptides / classification
  • Peptides / genetics*
  • Protein Conformation
  • Reproducibility of Results
  • Sequence Alignment
  • Toxins, Biological / chemistry*
  • Toxins, Biological / classification
  • Toxins, Biological / genetics*


Substances

  • Antimicrobial Cationic Peptides
  • Insect Proteins
  • Neuropeptides
  • Peptides
  • Toxins, Biological
  • Apamin

Associated data

  • GENBANK/AAY54877
  • GENBANK/BE015616
  • GENBANK/BP115546
  • GENBANK/BT022461
  • GENBANK/BX614868
  • GENBANK/BX627342
  • GENBANK/CD578321
  • GENBANK/CR528986
  • GENBANK/CX309613
  • GENBANK/DN297299
  • GENBANK/DT664910
  • GENBANK/DW209856
  • GENBANK/EAL27268
  • OMIM/248300
  • RefSeq/XM_001120252