Improved functional prediction of proteins by learning kernel combinations in multilabel settings

BMC Bioinformatics. 2007 May 3;8 Suppl 2(Suppl 2):S12. doi: 10.1186/1471-2105-8-S2-S12.

Abstract

Background: We develop a probabilistic model for combining kernel matrices to predict the function of proteins. It extends previous approaches by handling multiple labels per protein, which arise naturally in the context of protein function.
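
To make the idea of combining kernel matrices concrete, the following is a minimal sketch, not the model developed in the paper. It assumes two hypothetical data sources (stand-ins for sequence and expression features), builds one Gram matrix per source, and forms a convex combination of them. The function names, the fixed weights, and the random toy data are all assumptions made for illustration; in the paper the combination is learned within a probabilistic model.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of inner products between the rows of X."""
    return X @ X.T

def rbf_kernel(X, gamma):
    """Gaussian (RBF) Gram matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off

def combine_kernels(kernels, weights):
    """Convex combination of Gram matrices.

    Non-negative weights keep the result positive semidefinite,
    so it is again a valid kernel matrix.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Toy data (hypothetical): 6 proteins described by two feature sets.
rng = np.random.default_rng(0)
X_seq = rng.normal(size=(6, 4))    # stand-in for sequence-derived features
X_expr = rng.normal(size=(6, 3))   # stand-in for expression features

# Fixed weights chosen by hand here; a learning method would estimate them.
K = combine_kernels(
    [linear_kernel(X_seq), rbf_kernel(X_expr, gamma=0.5)],
    weights=[0.3, 0.7],
)
```

The combined matrix K can be passed to any kernel classifier that accepts a precomputed Gram matrix. The size of each weight indicates how much the corresponding data source contributes to the combined similarity.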

Results: Explicit modeling of multilabels significantly improves the capability of learning protein function from multiple kernels. The performance and the interpretability of the inference model are further improved by simultaneously predicting the subcellular localization of proteins and by combining pairwise classifiers into consistent class membership estimates.
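
The abstract does not say which rule is used to combine the pairwise classifiers. As an illustration only, the sketch below uses a simple closed-form coupling rule. It assumes each pairwise classifier outputs r_ij, an estimate of P(class i | class i or class j). If r_ij = p_i / (p_i + p_j), then summing 1 / r_ij over all j != i gives (k - 2) + 1 / p_i, which can be solved for p_i. The function name and the toy probabilities are hypothetical.

```python
import numpy as np

def couple_pairwise(R):
    """Turn pairwise probabilities into one class-membership vector.

    R[i, j] is assumed to estimate P(class i | class i or j) and to lie
    in (0, 1] off the diagonal; the diagonal is ignored.
    Uses p_i = 1 / (sum_{j != i} 1 / R[i, j] - (k - 2)), then renormalizes
    so the estimates sum to one even when the inputs are not consistent.
    """
    R = np.asarray(R, dtype=float)
    k = R.shape[0]
    off_diag = ~np.eye(k, dtype=bool)
    # Replace the diagonal by 1 before inverting, then zero it out.
    inv = np.where(off_diag, 1.0 / np.where(off_diag, R, 1.0), 0.0)
    p = 1.0 / (inv.sum(axis=1) - (k - 2))
    return p / p.sum()

# Toy check (hypothetical numbers): build consistent pairwise
# probabilities from known class probabilities and recover them.
true_p = np.array([0.5, 0.3, 0.2])
R = true_p[:, None] / (true_p[:, None] + true_p[None, :])
p_hat = couple_pairwise(R)
```

With perfectly consistent pairwise inputs the rule recovers the underlying class probabilities exactly. With noisy classifier outputs it gives only an approximation, and iterative coupling schemes are a common alternative.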

Conclusion: For the functional prediction of proteins, multilabels provide valuable information that should be adequately included in the training of classifiers. Learning of functional categories benefits from co-prediction of subcellular localization. Pairwise separation rules allow very detailed insights into the relevance of different measurements such as sequence, structure, interaction data, or expression data. A preliminary version of the software can be downloaded from http://www.inf.ethz.ch/personal/vroth/KernelHMM/.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Computer Simulation
  • Fungal Proteins / chemistry*
  • Fungal Proteins / classification
  • Fungal Proteins / metabolism*
  • Models, Biological*
  • Sequence Analysis, Protein / methods*
  • Signal Transduction / physiology*
  • Structure-Activity Relationship
  • Yeasts / metabolism*

Substances

  • Fungal Proteins