PLoS Comput Biol. 2010 Feb 5;6(2):e1000662.
doi: 10.1371/journal.pcbi.1000662.

Network-based Elucidation of Human Disease Similarities Reveals Common Functional Modules Enriched for Pluripotent Drug Targets

Free PMC article

Silpa Suthram et al. PLoS Comput Biol.


Current work in elucidating relationships between diseases has largely been based on pre-existing knowledge of disease genes. Consequently, these studies are limited in their discovery of new and unknown disease relationships. We present the first quantitative framework to compare and contrast diseases by an integrated analysis of disease-related mRNA expression data and the human protein interaction network. We identified 4,620 functional modules in the human protein network and provided a quantitative metric to record their responses in 54 diseases leading to 138 significant similarities between diseases. Fourteen of the significant disease correlations also shared common drugs, supporting the hypothesis that similar diseases can be treated by the same drugs, allowing us to make predictions for new uses of existing drugs. Finally, we also identified 59 modules that were dysregulated in at least half of the diseases, representing a common disease-state "signature". These modules were significantly enriched for genes that are known to be drug targets. Interestingly, drugs known to target these genes/proteins are already known to treat significantly more diseases than drugs targeting other genes/proteins, highlighting the importance of these core modules as prime therapeutic opportunities.

Conflict of interest statement

Atul J. Butte is or has served as a scientific advisor and/or consultant to NuMedii, Genstruct, Prevendia, Tercica, Eli Lilly and Company, and Johnson & Johnson.


Figure 1. Overview of the process to generate the module response scores for each disease.
(A) Normalization of the gene expression matrices through a Z-score transformation. In the gene expression matrix for a given disease k, g_ij represents the expression value of gene i in sample j, g_j corresponds to the whole set of gene expression values for a given sample j (the jth column), and z_ij corresponds to the z-score transformed expression value of gene i in sample j. (B) Response score of a gene in a given disease. The response score of gene i in a disease k is the t-test statistic between the disease and control sample values for that gene. This score is represented as S_ik. (C) Module response score calculation. The Module Response Score (MRS) of a given module i in a given disease k (M_ik) is the average of the response scores of its component genes. A detailed description of this process is provided in the Methods section.
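The three steps in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the sample standard deviation is assumed for the z-score, and Welch's unequal-variance form is assumed for the t-statistic, since the caption does not say which variant was used.

```python
from math import sqrt
from statistics import mean, stdev, variance


def zscore_columns(matrix):
    """Step A: z-score each sample (column) of a genes x samples matrix.

    Assumption: the sample standard deviation is used for scaling.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[matrix[i][j] for i in range(n_genes)] for j in range(n_samples)]
    stats = [(mean(col), stdev(col)) for col in columns]
    return [
        [(matrix[i][j] - stats[j][0]) / stats[j][1] for j in range(n_samples)]
        for i in range(n_genes)
    ]


def welch_t(disease_values, control_values):
    """Welch's t-statistic (assumed variant) between two groups of values.

    Raises ZeroDivisionError if both groups have zero variance.
    """
    standard_error = sqrt(
        variance(disease_values) / len(disease_values)
        + variance(control_values) / len(control_values)
    )
    return (mean(disease_values) - mean(control_values)) / standard_error


def gene_response_scores(z, disease_cols, control_cols):
    """Step B: one response score per gene (row of the z-scored matrix)."""
    return [
        welch_t([row[j] for j in disease_cols], [row[j] for j in control_cols])
        for row in z
    ]


def module_response_score(scores, module_genes):
    """Step C: average the response scores of a module's component genes."""
    return mean(scores[i] for i in module_genes)
```

Here `module_genes` is a list of row indices for the genes in one module; repeating the last call over every module and every disease would give the full matrix of module response scores.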
Figure 2. Significant disease-disease similarities.
(A) Hierarchical clustering of the disease correlations. The distance between two diseases was defined as (1 - correlation coefficient) of the two diseases. The tree was constructed using the average method of hierarchical clustering. The red line corresponds to a p-value of 0.01 and an FDR of 10.37%; disease correlations below this line are considered significant. The different colors represent the various categories of significant disease correlations. (B) The network of all 138 significant disease correlations. The colors correspond to the significant disease correlation categories in (A). The nodes colored in grey are not marked in (A).
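The clustering described in panel (A) can be illustrated with a small pure-Python sketch. It assumes each disease is represented by a vector of module response scores and that the correlation is Pearson's; the caption specifies neither, and the function names are hypothetical. The significance threshold (the red line) is not reproduced here.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering with distance = 1 - correlation.

    profiles: dict mapping a disease name to its score vector.
    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of disease names.
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def cluster_distance(c1, c2):
        # Average method: mean of all pairwise distances between members.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters and merge it.
        d, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

Cutting the resulting tree at a fixed distance would correspond to keeping only the merges whose distance falls below that cutoff.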
Figure 3. Underlying functional modules.
(A) Two representative samples of functional modules. (i) The synaptic vesicle module is one of the most down-regulated modules among a set of brain disorders: Alzheimer's disease, bipolar disorder, schizophrenia and glioblastoma. (ii) The DNA repair module is one of the most up-regulated modules among the lung cancers and uterine leiomyoma. The colors of the nodes represent their average gene expression in their corresponding diseases. Genes marked with a red star are genes with known variants associated with disease (see Methods). (B) A representative sample of common disease "signature" modules. The genes colored in orange correspond to known drug targets. The functions for the modules were obtained with the functional enrichment tool of the DAVID database.



