Direct-coupling analysis of residue coevolution captures native contacts across many protein families
- PMID: 22106262
- PMCID: PMC3241805
- DOI: 10.1073/pnas.1111471108
Abstract
The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced direct-coupling analysis (DCA). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intradomain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and interdomain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.
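The distinction between direct and indirect correlations can be illustrated with a toy calculation. The sketch below is not the implementation described in the abstract. It assumes a Gaussian analogy in which three continuous variables form a chain 1-2-3, and it estimates couplings as the negative inverse of the covariance matrix, in the spirit of a mean-field approximation. The covariance values, the helper `invert`, and the variable names are hypothetical and chosen only for illustration.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


# Hypothetical covariance of a chain 1-2-3: variables 1 and 3 interact
# only through variable 2, yet their covariance (0.25) is not zero.
covariance = [
    [0.75, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 0.75],
]

# Assumed Gaussian / mean-field-style estimate: couplings = -(covariance)^-1.
# Only the off-diagonal entries are read as pairwise couplings.
couplings = [[-v for v in row] for row in invert(covariance)]

print("indirect correlation 1-3:", covariance[0][2])
print("estimated coupling 1-2:", couplings[0][1])
print("estimated coupling 1-3:", couplings[0][2])
```

In this toy case the 1-3 covariance is nonzero, but the estimated 1-3 coupling vanishes up to rounding, while the 1-2 and 2-3 couplings remain. A correlation-based score would therefore report a spurious 1-3 link that a coupling-based score would not.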
Conflict of interest statement
The authors declare no conflict of interest.
