Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling
- PMID: 26496371
- PMCID: PMC4619893
- DOI: 10.1371/journal.pcbi.1004343
Abstract
Homology modeling predicts the 3D structure of a query protein based on the sequence alignment with one or more template proteins of known structure. Its great importance for biological research is owed to its speed, simplicity, reliability and wide applicability, covering more than half of the residues in protein sequence space. Although multiple templates have been shown to generally increase model quality over single templates, the information from multiple templates has so far been combined using empirically motivated, heuristic approaches. We present here a rigorous statistical framework for multi-template homology modeling. First, we find that the query proteins' atomic distance restraints can be accurately described by two-component Gaussian mixtures. This insight allowed us to apply the standard laws of probability theory to combine restraints from multiple templates. Second, we derive theoretically optimal weights to correct for the redundancy among related templates. Third, a heuristic template selection strategy is proposed. We improve the average GDT-ha model quality score by 11% over single-template modeling and by 6.5% over a conventional multi-template approach on a set of 1000 query proteins. Robustness with respect to wrong constraints is likewise improved. We have integrated our multi-template modeling approach with the popular MODELLER homology modeling software in our free HHpred server (http://toolkit.tuebingen.mpg.de/hhpred) and also offer open-source software for running MODELLER with the new restraints (https://bitbucket.org/soedinglab/hh-suite).
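The idea of describing a distance restraint as a two-component Gaussian mixture and combining restraints from several templates can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper or from MODELLER: the function names, the parameter values, the shared mean of the two mixture components, and the combination rule (a weighted product of per-template densities, renormalized on a grid). The weights stand in for the redundancy correction mentioned in the abstract; the paper's actual derivation may differ.

```python
import numpy as np

def gaussian(d, mu, sigma):
    """Normal density evaluated at distances d."""
    return np.exp(-0.5 * ((d - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def mixture_density(d, mu, sigma_tight, sigma_broad, p_tight):
    """Two-component Gaussian mixture around one template distance mu.

    Assumption: a tight component models a well-aligned residue pair and a
    broad component models an unreliable one; both share the same mean.
    """
    return (p_tight * gaussian(d, mu, sigma_tight)
            + (1.0 - p_tight) * gaussian(d, mu, sigma_broad))

def combine_restraints(d_grid, templates, weights):
    """Combine per-template mixtures as a weighted product (illustrative rule).

    Works in log space for numerical stability and renormalizes on the grid
    so that the result integrates to one.
    """
    log_p = np.zeros_like(d_grid)
    for (mu, s_tight, s_broad, p_tight), w in zip(templates, weights):
        dens = mixture_density(d_grid, mu, s_tight, s_broad, p_tight)
        log_p += w * np.log(dens + 1e-300)  # small offset avoids log(0)
    p = np.exp(log_p - log_p.max())
    step = d_grid[1] - d_grid[0]
    return p / (p.sum() * step)

# Hypothetical example: two templates suggest 10.0 and 10.8 Angstrom
# for the same atom pair; values are made up for illustration.
d_grid = np.linspace(0.0, 20.0, 2001)
templates = [(10.0, 0.5, 3.0, 0.8), (10.8, 0.5, 3.0, 0.8)]
weights = [0.7, 0.7]  # below 1 to down-weight two related templates
density = combine_restraints(d_grid, templates, weights)
peak = d_grid[np.argmax(density)]
print(f"most probable distance: {peak:.2f}")
```

Choosing weights below one for related templates keeps two near-identical templates from counting as two fully independent observations, which is the intuition behind a redundancy correction.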
Conflict of interest statement
The authors have declared that no competing interests exist.
