Transcriptional gene network inference from a massive dataset elucidates transcriptome organization and gene function

Nucleic Acids Res. 2011 Nov 1;39(20):8677-88. doi: 10.1093/nar/gkr593. Epub 2011 Jul 23.


We collected a massive and heterogeneous dataset of 20 255 gene expression profiles (GEPs) from a variety of human samples and experimental conditions, as well as 8895 GEPs from mouse samples. We developed a mutual information (MI) reverse-engineering approach to quantify the extent to which the mRNA levels of two genes are related to each other across the dataset. The resulting networks consist of 4 817 629 connections among 20 255 transcripts in human and 14 461 095 connections among 45 101 transcripts in mouse, with an inter-species conservation of 12%. The inferred connections were compared against known interactions to assess their biological significance. We experimentally validated a subset of previously undescribed protein-protein interactions. We discovered co-expressed modules within the networks, consisting of genes strongly connected to each other, which carry out specific biological functions and tend to be in physical proximity at the chromatin level in the nucleus. We show that the network can be used to predict the biological function and subcellular localization of a protein, and to elucidate the function of a disease gene. We experimentally verified that the granulin precursor (GRN) gene, whose mutations cause frontotemporal lobar degeneration, is involved in lysosome function. We have developed an online tool to explore the human and mouse gene networks.
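
To illustrate the general idea of an MI-based network, the following is a minimal sketch, not the authors' pipeline. It assumes a simple equal-width histogram estimator of mutual information and a fixed cutoff for calling a connection; the abstract does not specify the estimator, the binning, or the significance test actually used. The function names `mutual_information` and `infer_edges`, the `bins` value and the `threshold` value are all hypothetical choices made for this example.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram-based MI estimate (in bits) between two expression vectors.

    Assumption: equal-width binning; the paper's estimator may differ.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Joint histogram of the two genes across all expression profiles.
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    # Marginal distributions, kept 2-D so their product has the joint's shape.
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nonzero = pxy > 0  # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nonzero] * np.log2(pxy[nonzero] / (px @ py)[nonzero])))


def infer_edges(expr, gene_names, threshold=0.5, bins=8):
    """Return (gene_a, gene_b, mi) for every gene pair with MI >= threshold.

    expr has one row per gene and one column per expression profile.
    Assumption: a fixed MI cutoff stands in for a proper significance test.
    """
    edges = []
    n_genes = expr.shape[0]
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            mi = mutual_information(expr[i], expr[j], bins=bins)
            if mi >= threshold:
                edges.append((gene_names[i], gene_names[j], mi))
    return edges
```

Calling `infer_edges(expr, gene_names)` on a genes-by-profiles matrix yields a list of weighted connections. The all-pairs loop is quadratic in the number of genes, so a dataset at the scale described here would need a vectorized or parallel implementation.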

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Gene Expression Profiling
  • Gene Regulatory Networks*
  • HeLa Cells
  • Humans
  • Intercellular Signaling Peptides and Proteins / genetics
  • Lysosomes / ultrastructure
  • Mice
  • Progranulins
  • Protein Interaction Maps
  • Transcriptome*


Substances

  • Intercellular Signaling Peptides and Proteins
  • Progranulins