Predicting human protein subcellular locations by the ensemble of multiple predictors via protein-protein interaction network with edge clustering coefficients

PLoS One. 2014 Jan 23;9(1):e86879. doi: 10.1371/journal.pone.0086879. eCollection 2014.

Abstract

One of the fundamental tasks in biology is to identify the functions of all proteins to reveal the primary machinery of a cell. Knowledge of the subcellular locations of proteins provides key hints about their functions and helps to explain the intricate pathways that regulate biological processes at the cellular level. Protein subcellular location prediction has been studied extensively over the past two decades. Many methods have been developed based on protein primary sequences as well as on protein-protein interaction networks. In this paper, we propose to use the protein-protein interaction network as an infrastructure to integrate existing sequence-based predictors. When predicting the subcellular locations of a given protein, not only the protein itself but also all of its interacting partners are considered. Unlike existing methods, our method requires neither comprehensive knowledge of the protein-protein interaction network nor experimentally annotated subcellular locations for most proteins in the network. In addition, our method can be used as a framework to integrate multiple predictors. On the human proteome, our method achieved an absolute-true rate of 56%, which is higher than that of state-of-the-art methods.
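The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes one common definition of the edge clustering coefficient (triangles through an edge divided by min(deg(u) - 1, deg(v) - 1)), a simple weighted sum for combining a protein's own predictor scores with those of its interaction partners, and a relative threshold for choosing locations. The absolute-true rate is taken to mean the fraction of proteins whose predicted location set exactly matches the annotated set. All function names, parameters, and the toy data are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles through edge (u, v) divided by
    min(deg(u) - 1, deg(v) - 1); 0.0 when no triangle is possible."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0
    return len(adj[u] & adj[v]) / denom


def ensemble_scores(protein, adj, base_scores, self_weight=1.0):
    """Combine the protein's own sequence-based scores with those of its
    interaction partners, each partner weighted by the coefficient of the
    connecting edge (the combination rule is an assumption)."""
    combined = {}
    for loc, s in base_scores.get(protein, {}).items():
        combined[loc] = combined.get(loc, 0.0) + self_weight * s
    for nb in adj.get(protein, ()):
        w = edge_clustering_coefficient(adj, protein, nb)
        for loc, s in base_scores.get(nb, {}).items():
            combined[loc] = combined.get(loc, 0.0) + w * s
    return combined


def predict_locations(scores, rel_threshold=0.5):
    """Keep every location scoring at least rel_threshold times the best
    score (hypothetical rule for multi-location proteins)."""
    if not scores:
        return set()
    top = max(scores.values())
    return {loc for loc, s in scores.items() if s >= rel_threshold * top}


def absolute_true_rate(predicted, actual):
    """Fraction of proteins whose predicted set equals the true set exactly."""
    hits = sum(1 for p, locs in actual.items() if predicted.get(p, set()) == locs)
    return hits / len(actual)


# Toy network and toy predictor outputs (illustrative only, not real data).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
adj = build_adjacency(edges)
base_scores = {
    "A": {"nucleus": 0.4, "cytoplasm": 0.6},
    "B": {"nucleus": 0.9},
    "C": {"nucleus": 0.8},
    "D": {"membrane": 1.0},
}
scores_a = ensemble_scores("A", adj, base_scores)
locations_a = predict_locations(scores_a)
```

In this toy network, the partners B and C each share a triangle with A, so their scores carry weight in the combination, whereas D is attached to A by a single edge with no triangle and contributes nothing. Scores from several sequence-based predictors could be merged into `base_scores` before this step, which is one way to read the "framework to integrate multiple predictors" in the abstract.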

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computational Biology*
  • Databases, Protein
  • Humans
  • Protein Interaction Maps*
  • Proteins / chemistry*
  • Proteins / metabolism*
  • Proteome / analysis*
  • Software
  • Subcellular Fractions / chemistry*
  • Subcellular Fractions / metabolism*

Substances

  • Proteins
  • Proteome

Grants and funding

This work was supported by the National Science Foundation of China (NSFC 61005041), a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 122511), the Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP 20100032120039), the Tianjin Science Foundation (No. 12JCQNJC02300), the China Postdoctoral Science Foundation (2012T50240 and 2013M530114) and the Seed Foundation of Tianjin University (No. 60302006 and 60302024). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.