Correlated substitution analysis and the prediction of amino acid structural contacts
- PMID: 18000015
- DOI: 10.1093/bib/bbm052
Abstract
It has long been suspected that analysis of correlated amino acid substitutions should uncover pairs or clusters of sites that are spatially proximal in mature protein structures. Accordingly, methods based on different mathematical principles such as information theory, correlation coefficients and maximum likelihood have been developed to identify co-evolving amino acids from multiple sequence alignments. Sets of pairs of sites whose behaviour is identified by these methods as correlated are often significantly enriched in pairs of spatially proximal residues. However, relatively high levels of false-positive predictions typically render such methods, in isolation, of little use in the ab initio prediction of protein structure. Misleading signal (or problems with the estimation of significance levels) can arise from phylogenetic correlations between homologous sequences and from correlation due to factors other than spatial proximity (for example, correlation of sites that are not spatially close but are involved in common functional properties of the protein). In recent years, several workers have suggested that information from correlated substitutions should be combined with other sources of information (secondary structure, solvent accessibility, evolutionary rates) in an attempt to reduce the proportion of false-positive predictions. We review methods for the detection of correlated amino acid substitutions, compare their relative performance in contact prediction and predict future directions in the field.
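As a rough illustration of the information-theoretic family of methods mentioned in the abstract, the sketch below scores every pair of alignment columns by mutual information and ranks the pairs. It is a minimal example under stated assumptions, not the procedure of any method covered by the review: the function names (`column`, `mutual_information`, `rank_pairs`) and the four-sequence toy alignment are hypothetical, gaps are not handled, and no correction is applied for phylogenetic correlation, which the abstract identifies as a source of misleading signal.

```python
import math
from collections import Counter
from itertools import combinations


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Probabilities are plain observed frequencies; no pseudocounts and no
    phylogenetic correction are applied (a simplifying assumption).
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = p_ab * n * n / (count_a[a] * count_b[b])
        mi += p_ab * math.log2(ratio)
    return mi


def rank_pairs(msa):
    """Score all column pairs and return (score, i, j) tuples, highest first."""
    length = len(msa[0])
    scores = [
        (mutual_information(column(msa, i), column(msa, j)), i, j)
        for i, j in combinations(range(length), 2)
    ]
    return sorted(scores, reverse=True)


# Hypothetical toy alignment: columns 1 and 3 vary together (K with E, R with D),
# while columns 0 and 2 are fully conserved.
toy_msa = ["AKLE", "AKLE", "ARLD", "ARLD"]
top_score, top_i, top_j = rank_pairs(toy_msa)[0]
print(top_i, top_j, top_score)
```

In this toy case the covarying columns 1 and 3 rank first, and fully conserved columns score zero against everything. On real alignments a high raw score does not by itself indicate a structural contact, which is why the abstract stresses false positives and the value of combining correlated-substitution signal with other sources of information.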