Genes that are indispensable for survival are termed essential genes. Identifying and analyzing essential genes is important both for understanding the minimal requirements of cellular survival and for practical applications. Proteins do not function in isolation; they interact with one another in protein–protein interaction (PPI) networks, and a global analysis of these networks provides an effective way to elucidate the relationships between proteins. With the recent large-scale identification of essential genes and the production of large amounts of human PPI data, it is now possible to investigate the topological and biological properties of essential genes. However, human essential genes have not previously been investigated using both topological and biological properties. In this study, for the first time, 28 topological properties and 22 biological properties were used to characterize essential and non-essential genes in humans. Most of the properties differed significantly between essential and non-essential genes. The F-score was used to estimate how strongly each property relates to essentiality. A Gene Ontology (GO) enrichment analysis was performed to investigate the functions of the essential and non-essential genes. Finally, based on the topological features and the biological characteristics, a machine-learning classifier was constructed to predict essential genes. The results of the jackknife test and the 10-fold cross-validation test are encouraging, indicating that our classifier is an effective method for discovering human essential genes.
Keywords: Human essential genes; Protein–protein interaction; Statistical test; Support vector machine.
Copyright © 2014 Elsevier B.V. All rights reserved.
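The abstract mentions ranking properties by F-score without giving the formula. The sketch below assumes this is the feature-ranking F-score commonly paired with support vector machines: the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function names, the feature names, and the toy values are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """F-score of one feature (assumed definition, see lead-in).

    pos: feature values for essential genes (at least two values)
    neg: feature values for non-essential genes (at least two values)
    """
    overall = mean(list(pos) + list(neg))
    # Between-class separation: class means relative to the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class spread: sum of the two sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    if within == 0:
        # Degenerate case: no spread inside either class.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(essential, non_essential):
    """Return (feature, F-score) pairs sorted from most to least discriminative.

    Both arguments map a feature name to a list of values, one per gene.
    """
    scores = {
        name: f_score(essential[name], non_essential[name])
        for name in essential
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two made-up features for three genes per class.
toy_essential = {"degree": [10, 12, 14], "noise": [1, 2, 3]}
toy_non_essential = {"degree": [2, 4, 6], "noise": [1, 2, 3]}

ranking = rank_features(toy_essential, toy_non_essential)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A larger score means the feature separates the two classes more clearly. In the toy data the made-up "degree" feature ranks above "noise", whose values are identical in both classes.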