Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH
- PMID: 19777061
- PMCID: PMC2744927
- DOI: 10.1371/journal.pone.0007148
Abstract
Background: HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods that use single-sequence queries lack the sensitivity needed for more divergent repeats; when applied to proteins with known structures, they failed in some cases to detect even a single repeat.
Methodology and principal findings: Testing algorithms that use multiple sequence alignments as queries, we found that two of them, HHpred and COACH, detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats; the lowest detection rate for any single protein was 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT repeats and not observed at all for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries.
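The detection rates quoted above are of two kinds: an overall rate pooled across all repeats in the calibration structures, and the lowest rate seen for any single protein. The following is a minimal sketch of how such figures could be computed once hits from two methods are merged. The protein names, repeat indices, and function names are invented for illustration, and the merging rule (a repeat counts as detected if either method finds it) is an assumption, not the protocol described in the paper.

```python
from typing import Dict, Set

Hits = Dict[str, Set[int]]

# Toy data, invented for illustration only: repeat indices known from a
# structure, and the indices each (hypothetical) method recovered.
known: Hits = {"protA": {1, 2, 3, 4, 5}, "protB": {1, 2, 3, 4}}
hits_method_1: Hits = {"protA": {1, 2, 3}, "protB": {1, 2}}
hits_method_2: Hits = {"protA": {3, 4}, "protB": {2, 3}}


def combine(a: Hits, b: Hits) -> Hits:
    """Merge two hit sets per protein (assumed rule: union of both methods)."""
    return {p: a.get(p, set()) | b.get(p, set()) for p in set(a) | set(b)}


def per_protein_sensitivity(known: Hits, hits: Hits) -> Dict[str, float]:
    """Fraction of known repeats detected, computed separately per protein."""
    return {
        p: len(known[p] & hits.get(p, set())) / len(known[p]) for p in known
    }


def pooled_sensitivity(known: Hits, hits: Hits) -> float:
    """Fraction of known repeats detected, pooled over all proteins."""
    detected = sum(len(known[p] & hits.get(p, set())) for p in known)
    total = sum(len(repeats) for repeats in known.values())
    return detected / total


combined = combine(hits_method_1, hits_method_2)
per_protein = per_protein_sensitivity(known, combined)
pooled = pooled_sensitivity(known, combined)
worst = min(per_protein.values())

print(f"pooled sensitivity: {pooled:.2f}")
print(f"lowest per-protein sensitivity: {worst:.2f}")
```

Reporting the per-protein minimum alongside the pooled rate shows whether a good overall figure hides proteins for which few repeats are found.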
Significance: A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains.
