Amyloidogenic motifs revealed by n-gram analysis
- PMID: 29021608
- PMCID: PMC5636826
- DOI: 10.1038/s41598-017-13210-9
Abstract
Amyloids are proteins associated with several clinical disorders, including Alzheimer's disease and Creutzfeldt-Jakob disease. Despite their diversity, all amyloid proteins can undergo aggregation initiated by short segments called hot spots. To find the patterns defining the hot spots, we trained predictors of amyloidogenicity using n-grams and random forest classifiers. Since amyloidogenicity may depend not on the exact sequence of amino acids but on their more general properties, we tested 524,284 reduced amino acid alphabets of different lengths (three to six letters) to find the alphabet providing the best performance in cross-validation. The predictor based on this alphabet, called AmyloGram, was benchmarked on an external data set against the most popular tools for the detection of amyloid peptides and obtained the highest values of the performance measures (AUC: 0.90, MCC: 0.63). Our results showed sequential patterns in amyloids that are strongly correlated with hydrophobicity, a tendency to form β-sheets, and lower flexibility of amino acid residues. Among the most informative n-grams of AmyloGram, we identified 15 that were previously confirmed experimentally. AmyloGram is available as a web server at http://smorfland.uni.wroc.pl/shiny/AmyloGram/ and as the R package AmyloGram. R scripts and data used to produce the results of this manuscript are available at http://github.com/michbur/AmyloGramAnalysis .
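The idea of describing a peptide by n-grams over a reduced amino acid alphabet can be illustrated with a minimal Python sketch. It is not the AmyloGram implementation (which is distributed as an R package): the four-group alphabet `REDUCED_ALPHABET`, the helper names `encode`, `count_ngrams` and `ngram_features`, and the restriction to contiguous n-grams of length one to three are all assumptions made for illustration, and the grouping shown is not the alphabet selected in the study.

```python
from collections import Counter

# Hypothetical 4-letter reduced alphabet, chosen only for illustration.
# It is NOT the alphabet selected for AmyloGram.
REDUCED_ALPHABET = {
    "a": "GPST",      # small or flexible residues (assumed grouping)
    "b": "ILMVFWYC",  # hydrophobic residues (assumed grouping)
    "c": "DEKR",      # charged residues (assumed grouping)
    "d": "AHNQ",      # remaining residues (assumed grouping)
}

# Invert the grouping: amino acid letter -> reduced-alphabet letter.
AA_TO_GROUP = {aa: group for group, aas in REDUCED_ALPHABET.items() for aa in aas}


def encode(peptide):
    """Translate a peptide into the reduced alphabet."""
    return "".join(AA_TO_GROUP[aa] for aa in peptide.upper())


def count_ngrams(encoded, n):
    """Count contiguous n-grams of length n in an encoded sequence."""
    return Counter(encoded[i:i + n] for i in range(len(encoded) - n + 1))


def ngram_features(peptide, max_n=3):
    """Collect counts of all contiguous n-grams with 1 <= n <= max_n."""
    encoded = encode(peptide)
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(count_ngrams(encoded, n))
    return features


if __name__ == "__main__":
    peptide = "VQIVYK"  # arbitrary example input
    print(encode(peptide))
    print(ngram_features(peptide))
```

Feature vectors of this kind could then be passed to a classifier such as a random forest; repeating the procedure for each candidate alphabet and comparing cross-validation scores is one way to picture the alphabet search described in the abstract.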
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- FISH Amyloid - a new method for finding amyloidogenic segments in proteins based on site specific co-occurrence of aminoacids. BMC Bioinformatics. 2014 Feb 24;15:54. doi: 10.1186/1471-2105-15-54. PMID: 24564523. Free PMC article.
- Bioinformatics methods for identification of amyloidogenic peptides show robustness to misannotated training data. Sci Rep. 2021 Apr 26;11(1):8934. doi: 10.1038/s41598-021-86530-6. PMID: 33903613. Free PMC article.
- Breaking the amyloidogenicity code: methods to predict amyloids from amino acid sequence. FEBS Lett. 2013 Apr 17;587(8):1089-95. doi: 10.1016/j.febslet.2012.12.006. Epub 2012 Dec 20. PMID: 23262221. Review.
- Machine learning study of classifiers trained with biophysiochemical properties of amino acids to predict fibril forming Peptide motifs. Protein Pept Lett. 2012 Sep;19(9):917-23. doi: 10.2174/092986612802084429. PMID: 22486618.
- Amyloid peptides and proteins in review. Rev Physiol Biochem Pharmacol. 2007;159:1-77. doi: 10.1007/112_2007_0701. PMID: 17846922. Review.
Cited by
- Prion-like proteins: from computational approaches to proteome-wide analysis. FEBS Open Bio. 2021 Sep;11(9):2400-2417. doi: 10.1002/2211-5463.13213. Epub 2021 Jun 17. PMID: 34057308. Free PMC article. Review.
- Prediction of Signal Peptides in Proteins from Malaria Parasites. Int J Mol Sci. 2018 Nov 22;19(12):3709. doi: 10.3390/ijms19123709. PMID: 30469512. Free PMC article.
- A spatiotemporal reconstruction of the C. elegans pharyngeal cuticle reveals a structure rich in phase-separating proteins. Elife. 2022 Oct 19;11:e79396. doi: 10.7554/eLife.79396. PMID: 36259463. Free PMC article.
- Bioinformatics Methods in Predicting Amyloid Propensity of Peptides and Proteins. Methods Mol Biol. 2022;2340:1-15. doi: 10.1007/978-1-0716-1546-1_1. PMID: 35167067.
- AggreProt: a web server for predicting and engineering aggregation prone regions in proteins. Nucleic Acids Res. 2024 Jul 5;52(W1):W159-W169. doi: 10.1093/nar/gkae420. PMID: 38801076. Free PMC article.
