Comparative Study
Genome Res. 2003 Nov;13(11):2444-9.
doi: 10.1101/gr.1190803. Epub 2003 Oct 14.

Functionality of System Components: Conservation of Protein Function in Protein Feature Space

Free PMC article
Lars Juhl Jensen et al. Genome Res.

Abstract

Many protein features useful for prediction of protein function can be predicted from sequence, including posttranslational modifications, subcellular localization, and physical/chemical properties. We show here that such protein features are more conserved among orthologs than paralogs, indicating they are crucial for protein function and thus subject to selective pressure. This means that a function prediction method based on sequence-derived features may be able to discriminate between proteins with different function even when they have highly similar structure. Also, such a method is likely to perform well on organisms other than the one on which it was trained. We evaluate the performance of such a method, ProtFun, which relies on protein features as its sole input, and show that the method gives similar performance for most eukaryotes and performs much better than anticipated on archaea and bacteria. From this analysis, we conclude that for the posttranslational modifications studied, both the cellular use and the sequence motifs are conserved within Eukarya.

Figures

Figure 1
Estimated probability of the same cellular role as a function of sequence similarity for orthologs and paralogs. These probabilities were estimated as the overlap integral of the ProtFun predictions for the H. sapiens and D. melanogaster proteins involved in each pair. The probabilities could not be reliably estimated outside the range of 30%-80% identity, because orthology versus paralogy cannot be reliably predicted for distant homologs and because very closely related paralogs are likely to be predicted as orthologs.
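One plausible reading of the overlap integral in the Figure 1 caption can be sketched as follows. This is an illustration under assumptions the caption does not state: each prediction is treated as a discrete probability distribution over cellular role categories, and the two predictions are treated as independent, so the probability of a shared role is the sum over categories of the product of the two probabilities. The function name, category names, and numbers are hypothetical and are not ProtFun output.

```python
def same_role_probability(pred_a, pred_b):
    """Discrete overlap of two category distributions: sum_i p_a[i] * p_b[i].

    Assumption (not from the source): both inputs map the same category
    names to non-negative scores, and the two predictions are independent.
    """
    if pred_a.keys() != pred_b.keys():
        raise ValueError("predictions must cover the same categories")
    total_a = sum(pred_a.values())
    total_b = sum(pred_b.values())
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each prediction needs a positive total score")
    # Normalise each prediction so it sums to 1 before taking the overlap.
    return sum(
        (pred_a[c] / total_a) * (pred_b[c] / total_b) for c in pred_a
    )


# Made-up predictions for one human/fly protein pair (illustrative only).
human = {"Energy metabolism": 0.7, "Transport and binding": 0.2, "Translation": 0.1}
fly = {"Energy metabolism": 0.6, "Transport and binding": 0.3, "Translation": 0.1}
print(same_role_probability(human, fly))
```

Two predictions that concentrate on the same category give a value near 1, while predictions with no shared support give 0, which is why the quantity can serve as an estimate of agreement within a protein pair.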
Figure 2
ProtFun performance for functional classes and performance contributions from input features. For 44 organisms, the area under the receiver operating characteristic (ROC) curve has been plotted for all cellular role categories and enzyme classes (left panel). These performances were mapped onto input features based on the feature usage matrix (see Fig. 1 in Jensen et al. 2002).
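The area under the ROC curve used as the performance measure in Figure 2 can be computed directly from scores and labels. The sketch below uses the standard pairwise interpretation (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half); it is a generic illustration, not the authors' evaluation code, and the function name and example data are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted scores, higher meaning more likely positive.
    labels: truthy for members of the class, falsy otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up scores for one functional class (illustrative only).
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 0, 1, 0]
print(roc_auc(example_scores, example_labels))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the measure can be compared across organisms that have different proportions of proteins in each class.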
