DeepLoc: prediction of protein subcellular localization using deep learning
- PMID: 29036616
- DOI: 10.1093/bioinformatics/btx431
Erratum in
- DeepLoc: prediction of protein subcellular localization using deep learning. Bioinformatics. 2017 Dec 15;33(24):4049. doi: 10.1093/bioinformatics/btx548. PMID: 29028934.
Abstract
Motivation: The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only.
Results: Here, we present a prediction algorithm using deep neural networks to predict protein subcellular localization relying only on sequence information. At its core, the prediction model uses a recurrent neural network that processes the entire protein sequence and an attention mechanism that identifies protein regions important for subcellular localization. The model was trained and tested on a protein dataset extracted from one of the latest UniProt releases, in which experimentally annotated proteins follow more stringent criteria than previously. We demonstrate that our model achieves good accuracy (78% for 10 categories; 92% for membrane-bound or soluble), outperforming current state-of-the-art algorithms, including those relying on homology information.
Availability and implementation: The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc. Example code is available at https://github.com/JJAlmagro/subcellular_localization. The dataset is available at http://www.cbs.dtu.dk/services/DeepLoc/data.php.
Contact: jjalma@dtu.dk.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
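The attention mechanism mentioned in the Results can be illustrated with a small sketch. This is not the DeepLoc implementation: the dot-product scoring, the function names `softmax` and `attention_pool`, and the toy vectors are all assumptions made for illustration. In the actual model the per-residue vectors would come from the recurrent network; here they are hard-coded. The sketch only shows the general idea of scoring each sequence position, normalizing the scores into weights, and summarizing the sequence as a weighted sum, so that high-weight positions mark regions the model treats as informative.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool per-position vectors into one context vector.

    hidden_states: list of per-residue vectors (hypothetical RNN outputs).
    query: vector of the same dimension (assumed here to be a learned
           parameter; dot-product scoring is an illustrative choice).
    Returns (weights, context).
    """
    # One relevance score per sequence position.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the per-position vectors.
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Toy example: three sequence positions, two hidden dimensions.
hidden = [[0.1, 0.0], [2.0, 1.0], [0.2, -0.1]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden, query)
top_position = weights.index(max(weights))
```

In this toy input the second position receives the largest weight, so it dominates the pooled vector. A classifier over localization categories would then take `context` as its input.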
