The relationship between protein structure and function: a comprehensive survey with application to the yeast genome
- PMID: 10329133
- DOI: 10.1006/jmbi.1999.2661
Abstract
For most proteins in the genome databases, function is predicted via sequence comparison. In spite of the popularity of this approach, the extent to which it can be reliably applied is unknown. We address this issue by systematically investigating the relationship between protein function and structure. We focus initially on enzymes functionally classified by the Enzyme Commission (EC) and relate these to the structurally classified domains in the SCOP database. We find that the major SCOP fold classes have different propensities to carry out certain broad categories of functions. For instance, alpha/beta folds are disproportionately associated with enzymes, especially transferases and hydrolases, and all-alpha and small folds with non-enzymes, while alpha+beta folds have an equal tendency either way. These observations for the database overall are largely true for specific genomes. We focus, in particular, on yeast, analyzing it with many classifications in addition to SCOP and EC (i.e. COGs, CATH, MIPS), and find clear tendencies for fold-function association across a broad spectrum of functions. Analysis with the COGs scheme also suggests that the functions of the most ancient proteins are more evenly distributed among different structural classes than those of more modern ones. For the database overall, we identify the most versatile functions, i.e. those that are associated with the most folds, and the most versatile folds, i.e. those associated with the most functions. The two most versatile enzymatic functions (hydro-lyases and O-glycosyl glucosidases) are associated with seven folds each. The five most versatile folds (TIM-barrel, Rossmann, ferredoxin, alpha-beta hydrolase, and P-loop NTP hydrolase) are all mixed alpha-beta structures. They stand out as generic scaffolds, accommodating from six to as many as 16 functions (for the exceptional TIM-barrel).
At the conclusion of our analysis we are able to construct a graph giving the chance that a functional annotation can be reliably transferred at different degrees of sequence and structural similarity. Supplemental information is available from http://bioinfo.mbb.yale.edu/genome/foldfunc.
Copyright 1999 Academic Press.
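The notion of versatility used in the abstract (a function is versatile if it is associated with many folds, and a fold is versatile if it is associated with many functions) can be illustrated with a minimal counting sketch. This is not the authors' pipeline: the function name `versatility` and the placeholder labels (`fold_A`, `func_1`, and so on) are hypothetical and carry no biological meaning. The sketch only assumes that the input is a list of (fold, function) associations, such as might be derived by pairing SCOP folds with EC classes.

```python
from collections import defaultdict

# Hypothetical placeholder associations, invented purely for illustration.
# Each tuple is (fold label, function label); duplicates are allowed.
pairs = [
    ("fold_A", "func_1"),
    ("fold_A", "func_2"),
    ("fold_A", "func_3"),
    ("fold_B", "func_1"),
    ("fold_B", "func_2"),
    ("fold_C", "func_1"),
    ("fold_A", "func_1"),  # repeated observation, counted only once
]


def versatility(associations):
    """Count distinct functions per fold and distinct folds per function."""
    functions_per_fold = defaultdict(set)
    folds_per_function = defaultdict(set)
    for fold, function in associations:
        functions_per_fold[fold].add(function)
        folds_per_function[function].add(fold)
    fold_counts = {fold: len(fs) for fold, fs in functions_per_fold.items()}
    function_counts = {fn: len(fs) for fn, fs in folds_per_function.items()}
    return fold_counts, function_counts


fold_counts, function_counts = versatility(pairs)

# Rank from most to least versatile.
ranked_folds = sorted(fold_counts, key=fold_counts.get, reverse=True)
ranked_functions = sorted(function_counts, key=function_counts.get, reverse=True)
print(ranked_folds, ranked_functions)
```

Sets are used so that repeated observations of the same fold-function pair do not inflate the counts; only distinct associations contribute to a versatility score.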
