CopulaNet: Learning residue co-evolution directly from multiple sequence alignment for protein structure prediction
- PMID: 33953201
- PMCID: PMC8100175
- DOI: 10.1038/s41467-021-22869-8
Abstract
Residue co-evolution has become the primary principle for estimating the inter-residue distances of a protein, which are crucially important for predicting protein structure. Most existing approaches adopt an indirect strategy, i.e., they infer residue co-evolution from hand-crafted features, such as a covariance matrix, calculated from the multiple sequence alignment (MSA) of the target protein. This indirect strategy, however, cannot fully exploit the information carried by the MSA. Here, we report an end-to-end deep neural network, CopulaNet, that estimates residue co-evolution directly from the MSA. The key elements of CopulaNet are: (i) an encoder that models context-specific mutation for each residue; and (ii) an aggregator that models residue co-evolution and thereafter estimates inter-residue distances. Using CASP13 (the 13th Critical Assessment of Protein Structure Prediction) target proteins as representatives, we demonstrate that CopulaNet can predict protein structure with improved accuracy and efficiency. This study represents a step toward improved end-to-end prediction of inter-residue distances and protein tertiary structures.
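To make the "indirect strategy" concrete, the following is a minimal sketch of one hand-crafted feature the abstract mentions, a covariance matrix computed from an MSA. It is not CopulaNet and not the authors' code. The function names (`column_pair_covariance`, `coupling_score`), the toy alignment, and the choice of the Frobenius norm as a summary score are all assumptions made for illustration; practical pipelines typically also add sequence reweighting and pseudocounts, which are omitted here.

```python
import math
from collections import Counter


def column_pair_covariance(msa, i, j):
    """Covariance between residue types at MSA columns i and j.

    For residue types a, b: cov(a, b) = f_ij(a, b) - f_i(a) * f_j(b),
    where f denotes observed frequencies over the aligned sequences.
    (Illustrative helper; no sequence weighting or pseudocounts.)
    """
    n = len(msa)
    single_i = Counter(seq[i] for seq in msa)
    single_j = Counter(seq[j] for seq in msa)
    pair = Counter((seq[i], seq[j]) for seq in msa)
    cov = {}
    for a in single_i:
        for b in single_j:
            cov[(a, b)] = pair[(a, b)] / n - (single_i[a] / n) * (single_j[b] / n)
    return cov


def coupling_score(msa, i, j):
    """Summarize the covariance block for columns i, j by its Frobenius norm."""
    block = column_pair_covariance(msa, i, j)
    return math.sqrt(sum(v * v for v in block.values()))


# Hypothetical toy alignment: columns 0 and 1 mutate together,
# while columns 0 and 2 vary independently of each other.
toy_msa = ["AKLE", "AKIE", "GRLD", "GRID"]

score_coupled = coupling_score(toy_msa, 0, 1)
score_independent = coupling_score(toy_msa, 0, 2)
print(score_coupled, score_independent)
```

In this toy alignment the co-varying column pair receives a higher score than the independent pair. A feature of this kind reduces each column pair to summary statistics before any learning happens, which is the limitation the abstract attributes to the indirect strategy.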
Conflict of interest statement
The authors declare no competing interests.
Similar articles
- Analyzing effect of quadruple multiple sequence alignments on deep learning based protein inter-residue distance prediction. Sci Rep. 2021 Apr 7;11(1):7574. doi: 10.1038/s41598-021-87204-z. PMID: 33828153. Free PMC article.
- Seq-SetNet: directly exploiting multiple sequence alignment for protein secondary structure prediction. Bioinformatics. 2022 Jan 27;38(4):990-996. doi: 10.1093/bioinformatics/btab777. PMID: 34849579.
- Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13. Proteins. 2019 Dec;87(12):1082-1091. doi: 10.1002/prot.25798. Epub 2019 Aug 22. PMID: 31407406. Free PMC article.
- Recent Applications of Deep Learning Methods on Evolution- and Contact-Based Protein Structure Prediction. Int J Mol Sci. 2021 Jun 2;22(11):6032. doi: 10.3390/ijms22116032. PMID: 34199677. Free PMC article. Review.
- Emerging methods in protein co-evolution. Nat Rev Genet. 2013 Apr;14(4):249-61. doi: 10.1038/nrg3414. Epub 2013 Mar 5. PMID: 23458856. Review.
Cited by
- MoDAFold: a strategy for predicting the structure of missense mutant protein based on AlphaFold2 and molecular dynamics. Brief Bioinform. 2024 Jan 22;25(2):bbae006. doi: 10.1093/bib/bbae006. PMID: 38305456. Free PMC article.
- Improved AlphaFold modeling with implicit experimental information. Nat Methods. 2022 Nov;19(11):1376-1382. doi: 10.1038/s41592-022-01645-6. Epub 2022 Oct 20. PMID: 36266465. Free PMC article.
- Deep learning geometrical potential for high-accuracy ab initio protein structure prediction. iScience. 2022 May 18;25(6):104425. doi: 10.1016/j.isci.2022.104425. eCollection 2022 Jun 17. PMID: 35663033. Free PMC article.
- Fast multiple sequence alignment via multi-armed bandits. Bioinformatics. 2024 Jun 28;40(Suppl 1):i328-i336. doi: 10.1093/bioinformatics/btae225. PMID: 38940160. Free PMC article.
- ProALIGN: Directly Learning Alignments for Protein Structure Prediction via Exploiting Context-Specific Alignment Motifs. J Comput Biol. 2022 Feb;29(2):92-105. doi: 10.1089/cmb.2021.0430. Epub 2022 Jan 21. PMID: 35073170. Free PMC article.
