NeuroPID: a predictor for identifying neuropeptide precursors from metazoan proteomes
- PMID: 24336809
- DOI: 10.1093/bioinformatics/btt725
Abstract
Motivation: The evolution of multicellular organisms is associated with increasing variability of molecules governing behavioral and physiological states. This is often achieved by neuropeptides (NPs) that are produced in neurons from a longer protein, named neuropeptide precursor (NPP). The maturation of NPs occurs through a sequence of proteolytic cleavages. The difficulty in identifying NPPs is a consequence of their diversity and the lack of applicable sequence similarity among the short functionally related NPs.
Results: Herein, we describe Neuropeptide Precursor Identifier (NeuroPID), a machine learning scheme that predicts metazoan NPPs. NeuroPID was trained on hundreds of identified NPPs from the UniProtKB database. Some 600 features were extracted from the primary sequences and processed using support vector machines (SVM) and ensemble decision tree classifiers. These features combined biophysical, chemical and informational-statistical properties of NPs and NPPs. Other features were guided by the defining characteristics of the dibasic cleavage-site motif. NeuroPID reached 89-94% accuracy and 90-93% precision in cross-validation blind tests against known NPPs (with an emphasis on Chordata and Arthropoda). NeuroPID also identified NPP-like proteins from extensively studied model organisms as well as from poorly annotated proteomes. We then focused on the most significant sets of features that contribute to the success of the classifiers. We propose that NPPs are attractive targets for investigating and modulating behavior, metabolism and homeostasis and that a rich repertoire of NPs remains to be identified.
Availability: NeuroPID source code is freely available at http://www.protonet.cs.huji.ac.il/neuropid
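Illustration: The Results paragraph mentions features derived from the primary sequence, some of them guided by the dibasic cleavage-site motif. The abstract does not list the roughly 600 features, so the following Python sketch is only an illustration of what such features might look like, not NeuroPID's actual feature set. It assumes that a dibasic site is any adjacent pair of lysine (K) or arginine (R) residues, and the function names `dibasic_positions` and `extract_features` are hypothetical.

```python
import re
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: a dibasic site is any adjacent K/R pair (KK, KR, RK, RR).
# The lookahead lets overlapping pairs be found, e.g. "KRR" yields two sites.
DIBASIC = re.compile(r"(?=[KR][KR])")


def dibasic_positions(seq):
    """Return 0-based start positions of all (overlapping) dibasic pairs."""
    return [m.start() for m in DIBASIC.finditer(seq.upper())]


def extract_features(seq):
    """Build a small, illustrative feature dictionary for one protein sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")

    counts = Counter(seq)
    # Amino acid composition: fraction of each standard residue.
    features = {f"frac_{aa}": counts.get(aa, 0) / n for aa in AMINO_ACIDS}

    # Features inspired by the dibasic cleavage-site motif.
    sites = dibasic_positions(seq)
    features["dibasic_count"] = len(sites)
    features["dibasic_per_100"] = 100 * len(sites) / n
    features["basic_fraction"] = (counts.get("K", 0) + counts.get("R", 0)) / n
    return features


if __name__ == "__main__":
    toy = "MKRAAGKKLLRR"  # made-up sequence, not a real precursor
    print(dibasic_positions(toy))  # [1, 6, 10]
    print(extract_features(toy)["dibasic_count"])  # 3
```

A feature dictionary of this kind could be converted to a numeric vector and passed to a classifier such as an SVM. That step is omitted here because the abstract does not specify the training setup.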
