Identification of human disease genes from interactome network using graphlet interaction

PLoS One. 2014 Jan 22;9(1):e86142. doi: 10.1371/journal.pone.0086142. eCollection 2014.

Abstract

Identifying genes related to human diseases, such as cancer and cardiovascular disease, is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, have been used to identify disease genes based on the hypothesis that strong candidate genes tend to be closely related to each other under some measure on the network. We proposed a new measure for analyzing the relationship between network nodes, called graphlet interaction, which has 28 different isomers. The results showed that the numbers of graphlet interaction isomers between disease genes in interactome networks were significantly larger than those between randomly picked genes, whereas graphlet signatures showed no such difference. We then designed a new type of score, based on these network properties, to identify disease genes using graphlet interaction. Genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. The approach was evaluated by leave-one-out cross-validation. Its precision reached 90% at about 10% recall, which was higher than that of three previous predominant algorithms: random walk, Endeavour, and the neighborhood-based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases, most of which were identified by other independent experimental studies. In conclusion, we demonstrate that graphlet interaction is an effective tool for analyzing the network properties of disease genes, and that the scores calculated from graphlet interaction are more precise in identifying disease genes.
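The evaluation loop described in the abstract (score candidates, rank them, hold out one known disease gene at a time) can be illustrated with a minimal sketch. The abstract does not give the graphlet interaction score, so the sketch substitutes a placeholder: the number of direct neighbors that are known disease genes. This is a simple neighborhood count, not the authors' score. The function names, the pessimistic tie-breaking rule, and the precision and recall definitions at a rank cutoff `k` are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def neighbor_score(graph: Graph, gene: str, seeds: Set[str]) -> int:
    """Placeholder score (assumption): count of direct neighbors that are seeds.

    This stands in for the graphlet interaction score, which is not
    specified in the abstract.
    """
    return len(graph.get(gene, set()) & seeds)


def leave_one_out_ranks(graph: Graph, disease_genes: Set[str]) -> Dict[str, int]:
    """Hold out each known disease gene and rank it among all non-seed genes.

    Ties are broken pessimistically: the held-out gene is placed after
    every other candidate with an equal or higher score.
    """
    ranks: Dict[str, int] = {}
    for held_out in sorted(disease_genes):
        seeds = disease_genes - {held_out}
        candidates = [g for g in graph if g not in seeds]
        scores = {g: neighbor_score(graph, g, seeds) for g in candidates}
        target = scores[held_out]
        better_or_equal = sum(
            1 for g in candidates if g != held_out and scores[g] >= target
        )
        ranks[held_out] = better_or_equal + 1
    return ranks


def precision_recall_at_k(ranks: Dict[str, int], k: int) -> Tuple[float, float]:
    """Treat the top k candidates of every trial as predicted disease genes.

    Each trial contains exactly one true positive (the held-out gene).
    """
    trials = len(ranks)
    hits = sum(1 for r in ranks.values() if r <= k)
    return hits / (k * trials), hits / trials


if __name__ == "__main__":
    # Toy network with hypothetical gene names.
    toy = build_graph(
        [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
    )
    known = {"A", "B", "C", "E"}
    ranks = leave_one_out_ranks(toy, known)
    for k in (1, 2, 3):
        precision, recall = precision_recall_at_k(ranks, k)
        print(f"k={k}: precision={precision:.3f}, recall={recall:.3f}")
```

Sweeping `k` from small to large traces out a precision-recall curve, which is the form of comparison the abstract reports. Swapping `neighbor_score` for another scoring function leaves the rest of the loop unchanged.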

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Diabetes Mellitus / genetics
  • Diabetes Mellitus / metabolism
  • Gene Regulatory Networks*
  • Humans
  • Models, Biological*
  • Neoplasms / genetics
  • Neoplasms / metabolism
  • Protein Interaction Maps*
  • Reproducibility of Results

Grants and funding

National Natural Science Foundation of China (Nos. 11232010 and 11222223) and Shanghai Rising-Star Program (No. 11QA1403200). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.