Background: Cell responses to environmental stimuli are usually organized as relatively separate responsive gene modules at the molecular level. Identifying responsive gene modules, rather than individual differentially expressed (DE) genes, provides important information about the underlying molecular mechanisms. Most current methods formulate module identification as an optimization problem: find the active sub-networks in the genome-wide gene network by maximizing an objective function that considers gene differential expression and/or gene-gene co-expression information. Here we present a new formulation of this task: groups of closely connected and co-expressed DE genes in the gene network are regarded as the signatures of the underlying responsive gene modules; the modules can be identified by finding these signatures and then recovering the "missing parts" by adding the intermediate genes that connect the DE genes in the gene network.
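The formulation above can be illustrated with a minimal sketch. It is not the ClustEx implementation: it assumes an unweighted, undirected network stored as an adjacency dictionary, treats two DE genes as "closely connected" when they lie within a hop limit of each other, and omits the co-expression information entirely. The names `bfs_distances`, `find_modules`, `max_dist` and `min_size`, as well as the toy network, are hypothetical and chosen only for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_distances(graph, source, max_dist):
    """Hop distances from source to every node within max_dist hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue  # do not expand beyond the hop limit
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def find_modules(graph, de_genes, max_dist=2, min_size=2):
    """Group nearby DE genes, then add the genes that connect them."""
    de_genes = sorted(g for g in de_genes if g in graph)
    dist = {g: bfs_distances(graph, g, max_dist) for g in de_genes}

    # Step 1 (signatures): single-linkage grouping of DE genes that are
    # within max_dist hops of each other, using a small union-find.
    parent = {g: g for g in de_genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    close_pairs = [(a, b) for a, b in combinations(de_genes, 2) if b in dist[a]]
    for a, b in close_pairs:
        parent[find(a)] = find(b)

    modules = {}
    for g in de_genes:
        modules.setdefault(find(g), set()).add(g)

    # Step 2 ("missing parts"): a node lies on a shortest path between
    # a and b exactly when d(a, node) + d(node, b) == d(a, b).
    for a, b in close_pairs:
        module = modules[find(a)]
        for node, d_a in dist[a].items():
            if d_a + dist[b].get(node, max_dist + 1) == dist[a][b]:
                module.add(node)

    kept = [m for m in modules.values() if len(m) >= min_size]
    return sorted(kept, key=lambda m: (-len(m), sorted(m)))


# Toy network (hypothetical): upper-case nodes are DE genes,
# lower-case nodes are non-DE genes.
edges = [("A", "x"), ("x", "B"), ("B", "C"), ("C", "p"),
         ("p", "q"), ("q", "D"), ("D", "E")]
toy_graph = {}
for u, v in edges:
    toy_graph.setdefault(u, set()).add(v)
    toy_graph.setdefault(v, set()).add(u)

toy_modules = find_modules(toy_graph, {"A", "B", "C", "D", "E"}, max_dist=2)
print(toy_modules)
```

In this toy network the non-DE gene x is recovered as an intermediate between A and B, while p and q are left out because C and D are three hops apart and therefore fall into separate modules. A fuller treatment would also weight edges by co-expression, which this sketch deliberately leaves out.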
Results: ClustEx, a two-step method based on the new formulation, was developed and applied to identify the responsive gene modules of human umbilical vein endothelial cells (HUVECs) in inflammation and angiogenesis models by integrating time-course microarray data and genome-wide protein-protein interaction (PPI) data. When tested on reference responsive gene sets, it performed better than several available module identification tools. Gene set analysis of KEGG pathways, GO terms and microRNA (miRNA) target gene sets further supports the ClustEx predictions.
Conclusion: Taking the closely connected and co-expressed DE genes in the condition-specific gene network as the signatures of the underlying responsive gene modules provides a new strategy for solving the module identification problem. The identified responsive gene modules of HUVECs and the corresponding enriched pathways/miRNAs provide useful resources for understanding the inflammatory and angiogenic responses of vascular systems.