, 110 (14), 5558-63

The Human Gene Connectome as a Map of Short Cuts for Morbid Allele Discovery


Yuval Itan et al. Proc Natl Acad Sci U S A.


High-throughput genomic data reveal thousands of gene variants per patient, and it is often difficult to determine which of these variants underlies disease in a given individual. However, at the population level, there may be some degree of phenotypic homogeneity, with alterations of specific physiological pathways underlying the pathogenesis of a particular disease. We describe here the human gene connectome (HGC) as a unique approach for human mendelian genetic research, facilitating the interpretation of abundant genetic data from patients with the same disease, and guiding subsequent experimental investigations. We first defined the set of the shortest plausible biological distances, routes, and degrees of separation between all pairs of human genes by applying a shortest distance algorithm to the full human gene network. We then designed a hypothesis-driven application of the HGC, in which we generated a Toll-like receptor 3-specific connectome useful for the genetic dissection of inborn errors of Toll-like receptor 3 immunity. In addition, we developed a functional genomic alignment approach from the HGC. In functional genomic alignment, the genes are clustered according to biological distance (rather than the traditional molecular evolutionary genetic distance), as estimated from the HGC. Finally, we compared the HGC with three state-of-the-art methods: String, FunCoup, and HumanNet. We demonstrated that the existing methods are more suitable for polygenic studies, whereas HGC approaches are more suitable for monogenic studies. The HGC and functional genomic alignment data and computer programs are freely available to noncommercial users from and should facilitate the genome-wide selection of disease-causing candidate alleles for experimental validation.
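The shortest-distance step described in the abstract can be illustrated with a minimal sketch. It assumes Dijkstra's algorithm, which the abstract does not name, applied to a small weighted network. The gene names and edge weights below are hypothetical placeholders, not HGC or String data. Counting the degree of separation as the number of edges on the weighted shortest route is also an assumption made for illustration.

```python
import heapq


def shortest_routes(network, source):
    """Dijkstra's algorithm on a weighted gene network.

    Returns {gene: (distance, route)} for every gene reachable from source.
    Genes that cannot be reached are absent from the result.
    """
    best = {source: (0.0, [source])}
    queue = [(0.0, source, [source])]
    while queue:
        dist, gene, route = heapq.heappop(queue)
        if dist > best[gene][0]:
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in network.get(gene, {}).items():
            candidate = dist + weight
            if neighbor not in best or candidate < best[neighbor][0]:
                new_route = route + [neighbor]
                best[neighbor] = (candidate, new_route)
                heapq.heappush(queue, (candidate, neighbor, new_route))
    return best


# Hypothetical toy network: placeholder genes and invented edge weights.
TOY_NETWORK = {
    "GENE_A": {"GENE_B": 1.0, "GENE_C": 4.0},
    "GENE_B": {"GENE_A": 1.0, "GENE_C": 1.5, "GENE_D": 5.0},
    "GENE_C": {"GENE_A": 4.0, "GENE_B": 1.5, "GENE_D": 1.0},
    "GENE_D": {"GENE_B": 5.0, "GENE_C": 1.0},
    "GENE_E": {},  # isolated gene, cannot be connected to GENE_A
}

routes = shortest_routes(TOY_NETWORK, "GENE_A")
for gene in sorted(routes):
    distance, route = routes[gene]
    degrees = len(route) - 1  # edges along the route (assumed definition of C)
    print(gene, distance, degrees, " -> ".join(route))
```

In this toy network, GENE_E is never reached, which mirrors the small fraction of genes that Fig. 1 reports as impossible to connect. Running the function once per gene would yield all pairwise distances and routes.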

Conflict of interest statement

J.-L.C. is a member of the Sanofi Strategic Development and Scientific Advisory Committee.


Fig. 1.
(A) The proportions of the various degrees of separation (C) in the HGC. Only 0.086% of all human genes are directly connected (C = 1, data obtained directly from String). The median degree of separation between genes is 4 (39.932% of all connections), 0.041% of genes have a C ≥ 9, and 2.152% of human genes cannot be connected, mostly because they belong to isolated networks of small numbers of genes disconnected from the main human gene network. (B) Box plots displaying the range of biological distance (B) between genes for different degrees of separation C in the HGC. The box represents the 95% confidence interval for randomly sampled gene pairs, the circle represents the median value, the diamond represents the mean value and the vertical line shows the full range from the minimum to the maximum for the specific C value considered. The box on the right shows random sampling from the HGC for all C values, including C ≥ 9.
Fig. 2.
Genes within the top 5% of the TLR3 connectome: the 601 human genes with the shortest biological distances to TLR3, as identified from the HGC. The genes are placed in a 2D space (Materials and Methods) and the colors used indicate their unweighted distance from TLR3. The genes in the upper fifth percentile (the outer circle) were assigned a distance of 3.3, for clear visualization. The dashed lines show the predicted shortest plausible biological routes between TLR3 and the 17 (of 21) known TLR3-pathway genes within the top 5% of the TLR3 connectome. The TLR3 pathway genes known to be associated with HSE (all five are within the top 5% of the TLR3 connectome) are indicated by a violet star.
Fig. 3.
FGA of the genes in the top 5% of the TLR3 connectome. Based on weighted biological distances between genes, as determined from the HGC, a hierarchical clustering of the genes in the top 5% of the TLR3 connectome was generated and plotted. HSE-associated genes are shown in red, whereas known TLR3-pathway genes not known to be associated with HSE are shown in green. Genes belonging to the same clades as known TLR3-pathway genes are shown in pink. Genes that are not known to be associated with TLR3-pathway or HSE are shown in blue. See Materials and Methods for a detailed description of the FGA approach applied.
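The idea of clustering genes by biological distance can be sketched as follows. This example uses average-linkage agglomerative clustering as a stand-in; the clustering method actually used for FGA is described in the paper's Materials and Methods and may differ. The gene names and pairwise distances are hypothetical placeholders, not values from the HGC.

```python
from itertools import combinations


def average_linkage(distances, genes):
    """Agglomerative clustering with average linkage.

    distances maps frozenset({gene_x, gene_y}) to a pairwise distance.
    Returns the merges in order as (members_a, members_b, height).
    """
    clusters = [frozenset([g]) for g in genes]
    merges = []

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(x, y) for x in a for y in b]
        return sum(distances[frozenset((x, y))] for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((sorted(a), sorted(b), linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical pairwise biological distances between placeholder genes.
GENES = ["G1", "G2", "G3", "G4"]
TOY_DISTANCES = {
    frozenset(("G1", "G2")): 1.0,
    frozenset(("G3", "G4")): 1.5,
    frozenset(("G1", "G3")): 6.0,
    frozenset(("G1", "G4")): 7.0,
    frozenset(("G2", "G3")): 6.0,
    frozenset(("G2", "G4")): 7.0,
}

merges = average_linkage(TOY_DISTANCES, GENES)
for members_a, members_b, height in merges:
    print(members_a, "+", members_b, "at height", height)
```

Genes that merge at low heights end up in the same clade of the resulting tree, which is the sense in which Fig. 3 highlights genes belonging to the same clades as known TLR3-pathway genes.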
