A generic method for assignment of reliability scores applied to solvent accessibility predictions
- PMID: 19646261
- PMCID: PMC2725087
- DOI: 10.1186/1472-6807-9-51
Abstract
Background: Estimating the reliability of individual real-valued predictions is nontrivial, and the efficacy of such estimates is often questionable. It is important to know whether a given prediction can be trusted, and the best methods therefore associate each prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between the output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability of each prediction, in the form of a Z-score.
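The conventional reliability estimate for discrete predictions mentioned above can be illustrated with a minimal sketch. It assumes that the "selected classes" are the highest-scoring and second-highest-scoring classes, which is one common choice and not something the abstract specifies; the function name `classification_reliability` is hypothetical.

```python
def classification_reliability(scores):
    """Return (predicted_class_index, reliability) for one prediction.

    Assumption: reliability is the gap between the best and the
    runner-up class score. The abstract only says "difference between
    output scores of selected classes"; this is one common reading.
    """
    if len(scores) < 2:
        raise ValueError("need at least two class scores")
    # Rank class indices by descending output score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    return best, scores[best] - scores[runner_up]


if __name__ == "__main__":
    # Three hypothetical class scores, e.g. buried / intermediate / exposed.
    print(classification_reliability([0.1, 0.7, 0.2]))
```

A single real-valued output has no competing class scores to subtract, which is why this construction does not carry over to real-valued accessibility predictions.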
Results: An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures where reliabilities are obtained by post-processing the output.
Conclusion: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best publicly available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones across the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the top 20% of predictions ranked by reliability. For this subset, values of 0.79 and 0.74 are obtained using our method and Real-SPINE, respectively. The same trend holds for any selected subset.
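The subset comparison described above (Pearson's correlation coefficient restricted to the most reliable predictions) can be sketched as follows. This is an illustrative outline only: the function names `pearson` and `pcc_top_fraction` and the rounding rule for the subset size are assumptions, and the code does not reproduce the authors' evaluation pipeline.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pcc_top_fraction(predicted, observed, reliability, fraction=0.2):
    """Correlation between predicted and observed values, restricted to
    the `fraction` of residues with the highest reliability score.

    Assumption: a larger reliability value means a more trusted prediction.
    """
    if not (len(predicted) == len(observed) == len(reliability)):
        raise ValueError("all inputs must have the same length")
    # Indices sorted from most to least reliable.
    order = sorted(range(len(predicted)),
                   key=lambda i: reliability[i], reverse=True)
    # Keep at least two points so the correlation is defined.
    k = max(2, int(round(fraction * len(order))))
    top = order[:k]
    return pearson([predicted[i] for i in top], [observed[i] for i in top])
```

Evaluating `pcc_top_fraction` over a range of `fraction` values for two methods gives the kind of comparison summarized in the abstract: a more informative reliability score yields a higher correlation on the retained subset.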
