AdaBoost based multi-instance transfer learning for predicting proteome-wide interactions between Salmonella and human proteins

PLoS One. 2014 Oct 17;9(10):e110488. doi: 10.1371/journal.pone.0110488. eCollection 2014.

Abstract

Pathogen-host protein-protein interactions (PPIs) play an important role in revealing the underlying pathogenesis of viruses and bacteria. The need to rapidly map proteome-wide pathogen-host interactomes opens avenues for, and imposes burdens on, computational modeling. For Salmonella typhimurium, only 62 interactions with human proteins have been reported to date, and computational models trained on such a small data set are prone to overfitting. In this work, we propose a multi-instance transfer learning method to reconstruct the proteome-wide Salmonella-human PPI networks, wherein the training data are augmented by homolog knowledge transfer in the form of independent homolog instances. We use AdaBoost instance reweighting to counteract the noise from homolog instances, and design three experimental settings to validate the assumption that homolog instances are effective in addressing the problems of data scarcity and data unavailability. The experimental results show that the proposed method outperforms existing models, and some predictions are validated by findings in the recent literature. Lastly, we conduct gene ontology based clustering analysis of the predicted networks to provide insights into the pathogenesis of Salmonella.
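
As a rough illustration of instance reweighting over an augmented training set, the sketch below runs plain discrete AdaBoost with decision stumps on a toy data set in which hypothetical "homolog" rows are appended to a few "target" rows as independent training instances. The feature values, the labels, the 0.5 starting weight for homolog rows, and all function names are assumptions made for illustration. The abstract does not specify the paper's weighting scheme or features, so this should not be read as the authors' method.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= t else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def stump_predict(stump, row):
    _, j, t, polarity = stump
    return polarity if row[j] >= t else -polarity

def adaboost(X, y, init_w, rounds=3):
    """Discrete AdaBoost; returns the ensemble and the final instance weights."""
    total = sum(init_w)
    w = [wi / total for wi in init_w]
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the log below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified rows gain weight, correctly classified rows lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble, w

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy "target" instances (assumed): label +1 = interacting, -1 = non-interacting.
target_X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
target_y = [1, 1, -1, -1]

# Toy "homolog" instances (assumed); the last row is deliberately mislabelled noise.
homolog_X = [[0.85, 0.7], [0.7, 0.85], [0.15, 0.3], [0.3, 0.15], [0.95, 0.9]]
homolog_y = [1, 1, -1, -1, -1]

# Augment the training set and start homolog rows at a lower weight (assumed 0.5).
X = target_X + homolog_X
y = target_y + homolog_y
init_w = [1.0] * len(target_X) + [0.5] * len(homolog_X)

ensemble, weights = adaboost(X, y, init_w, rounds=3)
target_preds = [predict(ensemble, row) for row in target_X]
print(target_preds)
print([round(wi, 3) for wi in weights])
```

Plain AdaBoost raises the weight of any row it misclassifies, including a mislabelled homolog row, so how the reweighting is adapted to damp noisy homolog instances is a design choice that the abstract does not detail.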

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Gene Ontology
  • Host-Pathogen Interactions / genetics*
  • Humans
  • Protein Interaction Maps / genetics*
  • Proteome / genetics
  • Salmonella Infections / genetics*
  • Salmonella Infections / microbiology
  • Salmonella typhimurium / genetics*
  • Salmonella typhimurium / pathogenicity

Substances

  • Proteome

Grants and funding

The work is partly supported by China Postdoctoral Science Foundation Funded Projects (No. 2013M531869, No. 2014T70821). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.