Proteins perform the vast majority of the cellular functions that are critical to a cell's survival. Knowledge of a protein's subcellular localization can provide valuable information about its molecular function. Therefore, one of the fundamental goals in cell biology and proteomics is to analyze the subcellular localizations and functions of proteins. Recent large-scale human genomics and proteomics studies have made it possible to characterize human proteins at the level of subcellular localization. In this study, 8842 human proteins were classified into seven subcellular localizations according to their Swiss-Prot annotations. Proteins in the seven localizations were compared in terms of topological properties, biological properties, codon usage indices, mRNA expression levels, protein complexity and physicochemical properties. All of these properties were found to differ significantly among the seven categories. In addition, based on these properties and pseudo-amino acid compositions, a machine learning classifier was built to predict protein subcellular localization. This study is thus an attempt to use the aforementioned properties to compare human proteins of different subcellular localizations. We hope that the findings presented here will aid the prediction of protein subcellular localization and the understanding of the general functions of human proteins in cells.
Keywords: Biological properties; Codon usage bias; Expression level; Physicochemical properties; Topological properties.
Copyright © 2014 Elsevier Ltd. All rights reserved.