PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments
- PMID: 22101153
- DOI: 10.1093/bioinformatics/btr638
Abstract
Motivation: The accurate prediction of residue-residue contacts, which are critical for maintaining the native fold of a protein, remains an open problem in structural bioinformatics. Interest in this long-standing problem has increased recently with algorithmic improvements and the rapid growth in the sizes of sequence families. Progress could have major impacts on both structure and function prediction, to name but two benefits. Sequence-based contact predictions are usually made by identifying correlated mutations within multiple sequence alignments (MSAs), most commonly through the information-theoretic approach of calculating mutual information between pairs of sites in proteins. These predictions are often inaccurate because the true covariation signal in the MSA is frequently masked by biases from ancillary indirect-coupling or phylogenetic effects. Here we present a novel method, PSICOV, which introduces the use of sparse inverse covariance estimation to the problem of protein contact prediction. Our method builds on earlier work that demonstrated corrections for phylogenetic and entropic correlation noise, and it allows accurate discrimination of directly coupled from indirectly coupled mutation correlations in the MSA.
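To illustrate why an inverse covariance matrix can separate direct from indirect coupling, the following is a minimal toy sketch, not the PSICOV implementation. It assumes three hypothetical continuous variables in a chain x -> y -> z (standing in for alignment columns), with a covariance matrix chosen by hand; the helper `invert` is an illustrative name. PSICOV itself works on covariances computed from MSA columns and uses a sparse estimator rather than the plain inversion shown here.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a small square matrix exactly with Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Assumed toy model: x ~ N(0, 1), y = x + noise, z = y + noise (unit-variance noise).
# x and z are correlated only through y, yet their covariance is non-zero.
covariance = [
    [1, 1, 1],
    [1, 2, 2],
    [1, 2, 3],
]

precision = invert(covariance)

print("cov(x, z)       =", covariance[0][2])  # non-zero: indirect coupling shows up
print("precision(x, z) =", precision[0][2])   # zero: no direct coupling between x and z
```

In this toy case the covariance between x and z is 1 while the corresponding inverse covariance entry is exactly 0, which is the kind of direct-versus-indirect distinction the abstract describes. With real alignments the covariance matrix is estimated from limited data, which is why a sparse estimator is needed instead of direct inversion.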
Results: PSICOV displays a mean precision substantially better than the best performing normalized mutual information approach and Bayesian networks. For 118 out of 150 targets, the L/5 (i.e. top-L/5 predictions for a protein of length L) precision for long-range contacts (sequence separation >23) was ≥ 0.5, which represents an improvement sufficient to be of significant benefit in protein structure prediction or model quality assessment.
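The L/5 long-range precision quoted above can be sketched as follows. This is an illustrative calculation under stated assumptions, not the evaluation code used in the paper: the function name `top_l5_precision`, the dictionary input format, the use of floor division for L/5, and the toy scores and contacts are all hypothetical.

```python
def top_l5_precision(scores, true_contacts, length, min_separation=23):
    """Precision of the top-L/5 predicted contacts among long-range pairs.

    scores: dict mapping (i, j) residue index pairs (i < j) to a predicted score.
    true_contacts: set of (i, j) pairs (i < j) observed in the native structure.
    length: protein length L.
    min_separation: only pairs with j - i > min_separation are considered.
    """
    # Keep only long-range pairs (sequence separation > 23 by default).
    long_range = [
        (pair, score)
        for pair, score in scores.items()
        if abs(pair[1] - pair[0]) > min_separation
    ]
    # Rank by predicted score, highest first.
    long_range.sort(key=lambda item: item[1], reverse=True)
    # Assumption: L/5 is rounded down, with at least one prediction kept.
    n_top = max(1, length // 5)
    top_pairs = [pair for pair, _ in long_range[:n_top]]
    if not top_pairs:
        return 0.0
    hits = sum(1 for pair in top_pairs if pair in true_contacts)
    return hits / len(top_pairs)


# Toy data for a hypothetical protein of length 30, so L/5 = 6.
toy_scores = {
    (0, 24): 0.90,
    (1, 26): 0.80,
    (2, 27): 0.70,
    (3, 28): 0.60,
    (0, 29): 0.50,
    (4, 29): 0.40,
    (5, 29): 0.30,   # ranked seventh among long-range pairs, so not counted
    (10, 12): 0.95,  # short-range pair, excluded despite its high score
}
toy_contacts = {(0, 24), (1, 26), (3, 28), (5, 29), (10, 12)}

result = top_l5_precision(toy_scores, toy_contacts, length=30)
print(result)  # 3 of the top 6 long-range predictions are true contacts
```

On this toy input the function returns 0.5, which would sit exactly at the threshold the abstract reports for 118 of the 150 targets.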
Availability: The PSICOV source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/PSICOV.
