A top-down approach to enhance the power of predicting human protein subcellular localization: Hum-mPLoc 2.0

Anal Biochem. 2009 Nov 15;394(2):269-74. doi: 10.1016/j.ab.2009.07.046. Epub 2009 Aug 3.


Predicting the subcellular localization of human proteins is a challenging problem, particularly when query proteins may have a multiplex character, i.e., they simultaneously exist at, or move between, two or more different subcellular location sites. In a previous study, we developed a predictor called "Hum-mPLoc" to deal with the multiplex problem for the human protein system. However, Hum-mPLoc has the following shortcomings. (1) The accession number of a query protein must be supplied as input in order to obtain a higher expected success rate through the higher-level prediction pathway; but many proteins, such as synthetic and hypothetical proteins as well as newly discovered proteins not yet deposited in databanks, do not have accession numbers. (2) Neither functional domain information nor sequential evolution information was taken into account in Hum-mPLoc, and hence its power may be reduced accordingly. To address these shortcomings, a top-down strategy has been implemented. The new predictor thus obtained is called Hum-mPLoc 2.0, for which an accession number is no longer needed as input. Moreover, both the functional domain information and the sequential evolution information have been fused into the predictor by an ensemble classifier. As a consequence, the prediction power has been significantly enhanced. The web server of Hum-mPLoc 2.0 is freely accessible at http://www.csbio.sjtu.edu.cn/bioinf/hum-multi-2/.
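
The abstract does not describe how the ensemble classifier combines its inputs, so the following is only a minimal sketch of one common way to fuse multi-label outputs: simple voting across base predictors. The function name `fuse_predictions`, the `min_fraction` threshold, and the example location sets are all hypothetical and are not taken from Hum-mPLoc 2.0.

```python
from collections import Counter
from typing import Iterable, List, Set


def fuse_predictions(base_outputs: Iterable[Set[str]],
                     min_fraction: float = 0.5) -> List[str]:
    """Fuse location sets from several base predictors by voting.

    Hypothetical illustration only; not the published Hum-mPLoc 2.0 rule.
    A location is kept if at least `min_fraction` of the base predictors
    voted for it, so a protein can receive more than one location.
    """
    outputs = list(base_outputs)
    # Count how many base predictors voted for each location.
    votes = Counter(loc for out in outputs for loc in out)
    if not votes:
        return []
    needed = min_fraction * len(outputs)
    kept = [loc for loc, n in votes.items() if n >= needed]
    if not kept:
        # Fall back to the top-voted location(s) so the query is never
        # left without a prediction.
        top = max(votes.values())
        kept = [loc for loc, n in votes.items() if n == top]
    return sorted(kept)


# Assumed outputs: one predictor built on functional domain features and
# two built on sequence evolution features.
domain_based = {"Nucleus", "Cytoplasm"}
evolution_based_a = {"Nucleus"}
evolution_based_b = {"Nucleus", "Cytoplasm", "Mitochondrion"}

fused = fuse_predictions([domain_based, evolution_based_a, evolution_based_b])
print(fused)
```

Voting over sets, rather than picking a single winning class, is what lets a sketch like this return several locations for a multiplex protein.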

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein*
  • Evolution, Molecular
  • Humans
  • Internet
  • Protein Structure, Tertiary / genetics
  • Proteins / genetics*
  • Sequence Analysis, Protein
  • Software Design
  • Subcellular Fractions / chemistry


Substances

  • Proteins