SVM ensemble based transfer learning for large-scale membrane proteins discrimination

J Theor Biol. 2014 Jan 7:340:105-10. doi: 10.1016/j.jtbi.2013.09.007. Epub 2013 Sep 16.

Abstract

Membrane proteins play important roles in molecular trans-membrane transport, ligand-receptor recognition, cell-cell interaction, enzyme catalysis, host immune defense response and infectious disease pathways. To date, discriminating membrane proteins remains a challenging problem for both experimental determination and computational modeling. This work presents an SVM ensemble based transfer learning model for membrane protein discrimination (SVM-TLM). To ease the data constraints on computational modeling, the method investigates the effectiveness of transferring homolog knowledge to the target membrane proteins within a probability-weighted ensemble learning framework. Compared with a transfer learning model based on multiple kernel learning, the method takes advantage of sparseness-based SVM optimization on large data and is thus more computationally efficient for large-scale protein data analysis. Experiments on a large membrane protein benchmark dataset show that SVM-TLM achieves significantly better cross-validation performance than the baseline model.
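
The abstract does not spell out how the ensemble weights its members, so the following is only a minimal sketch of one plausible reading: probability-weighted soft voting over classifiers that have already been trained, for example one SVM on the target proteins and others on homolog data. The function name, the class labels, and the probability values are all hypothetical, and the per-model probabilities are assumed to come from some calibrated SVM output rather than from the paper's actual procedure.

```python
from typing import Dict, List, Tuple


def probability_weighted_vote(
    model_probs: List[Dict[str, float]],
) -> Tuple[str, Dict[str, float]]:
    """Combine class probabilities from several classifiers by soft voting.

    Each model contributes to a class in proportion to the probability it
    assigns to that class, so confident models weigh more than hesitant ones.
    This is an illustrative assumption, not the scheme used by SVM-TLM.
    """
    if not model_probs:
        raise ValueError("need at least one model output")

    # Sum the probability mass each class receives across all models.
    totals: Dict[str, float] = {}
    for probs in model_probs:
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + p

    norm = sum(totals.values())
    if norm <= 0.0:
        raise ValueError("probabilities must sum to a positive value")

    # Normalise so the combined scores form a probability distribution.
    combined = {label: total / norm for label, total in totals.items()}
    best = max(combined, key=lambda label: combined[label])
    return best, combined


# Hypothetical outputs for a single query protein.
target_svm = {"membrane": 0.55, "non-membrane": 0.45}     # trained on target data
homolog_svm_a = {"membrane": 0.90, "non-membrane": 0.10}  # trained on homolog data
homolog_svm_b = {"membrane": 0.40, "non-membrane": 0.60}  # trained on homolog data

label, combined = probability_weighted_vote(
    [target_svm, homolog_svm_a, homolog_svm_b]
)
print(label, combined)
```

In this toy case the confident homolog-trained model tips the decision toward the membrane class even though one other model disagrees, which is the kind of knowledge transfer the abstract describes in general terms.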

Keywords: Ensemble learning; Large data analysis; Performance overestimation; Protein subcellular localization; Transfer learning.

MeSH terms

  • Algorithms
  • Cell Communication
  • Cell Membrane / metabolism
  • Computational Biology / methods*
  • Computer Simulation
  • Databases, Protein
  • Ligands
  • Membrane Proteins / chemistry*
  • Models, Theoretical
  • Normal Distribution
  • Reproducibility of Results
  • Software
  • Support Vector Machine*
  • Time Factors

Substances

  • Ligands
  • Membrane Proteins