Comparison of targeted maximum likelihood and shrinkage estimators of parameters in gene networks

Stat Appl Genet Mol Biol. 2012 Sep 25;11(5):Article 2. doi: 10.1515/1544-6115.1728.

Abstract

Gene regulatory networks, in which edges between nodes describe interactions between transcription factors (TFs) and their target genes, model regulatory interactions that determine the cell-type- and condition-specific expression of genes. Regression methods can be used to identify TF-target gene interactions from gene expression and DNA sequence data. The response variable, i.e., observed gene expression, is modeled as a function of many predictor variables simultaneously. In practice, it is generally not possible to select a single model that clearly achieves the best fit to the observed experimental data, and the selected models typically contain overlapping sets of predictor variables. Moreover, parameters that represent the marginal effect of the individual predictors are not always present. In this paper, we use the statistical framework of estimation of variable importance to define variable importance as a parameter of interest and study two different estimators of this parameter in the context of gene regulatory networks. On yeast data we show that the resulting parameter has a biologically appealing interpretation. We apply the proposed methodology to mammalian gene expression data to gain insight into the temporal activity of TFs that underlie gene expression changes in F11 cells in response to Forskolin stimulation.
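
To make the regression setting concrete, the following is a minimal sketch of a generic shrinkage (ridge) estimate on simulated data. It is not the targeted maximum likelihood estimator, nor necessarily the shrinkage estimator studied in the paper; the abstract does not specify either. The data, the penalty value `lam`, the function name `ridge_coefficients`, and the use of absolute coefficient size as a crude importance ranking are all assumptions made purely for illustration.

```python
import numpy as np


def ridge_coefficients(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^{-1} X'y, without intercept.

    Hypothetical helper for illustration; lam = 0 gives ordinary least squares.
    """
    n_predictors = X.shape[1]
    penalty = lam * np.eye(n_predictors)
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


# Simulated stand-in data (assumption): rows are genes, columns are
# TF-related predictors such as binding-site scores.
rng = np.random.default_rng(0)
n_genes, n_tfs = 200, 5
X = rng.normal(size=(n_genes, n_tfs))
true_beta = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = X @ true_beta + rng.normal(size=n_genes)  # simulated expression values

beta_ols = ridge_coefficients(X, y, lam=0.0)     # unpenalized fit
beta_ridge = ridge_coefficients(X, y, lam=10.0)  # shrunken fit

# Rank predictors by absolute shrunken coefficient (a crude proxy only).
ranking = np.argsort(-np.abs(beta_ridge))

print("OLS coefficients:  ", np.round(beta_ols, 3))
print("Ridge coefficients:", np.round(beta_ridge, 3))
print("Predictors ranked by |ridge coefficient|:", ranking)
```

The penalty pulls every coefficient toward zero, which stabilizes estimates when many predictors are fitted simultaneously. A coefficient ranking of this kind depends on the chosen model, which is part of the motivation the abstract gives for defining variable importance as a parameter of interest in its own right.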

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Profiling / statistics & numerical data
  • Gene Regulatory Networks*
  • Likelihood Functions*
  • Models, Genetic
  • Probability
  • Regression Analysis
  • Transcription Factors / genetics
  • Transcription Factors / metabolism

Substances

  • Transcription Factors