Biological feature validation of estimated gene interaction networks from microarray data: a case study on MYC in lymphomas

Brief Bioinform. 2011 May;12(3):230-44. doi: 10.1093/bib/bbr007. Epub 2011 Apr 15.

Abstract

Gene expression is a dynamic process in which thousands of components interact in a complex way. A major goal in systems biology/medicine is to reconstruct the network of components from microarray data. Here, we address two key aspects of network reconstruction: (i) ergodicity supports the interpretation of the measured data as time averages and (ii) confounding is an important aspect of network reconstruction. To elucidate these aspects, we explore a data set of 214 lymphoma patients with a translocated or normal MYC gene. MYC (c-Myc) translocations to immunoglobulin heavy-chain (IGH@) or light-chain (IGK@, IGL@) loci lead to c-Myc overexpression and are widely believed to be the crucial initiating oncogenic events. There is a rich body of knowledge on the biological implications of the different translocations. In the context of these data, the article reflects on the relationship between the biological knowledge and the results of formal statistical estimates of gene interaction networks. The article identifies key steps to provide a trustworthy biological feature validation: (i) analysing a medium-sized network as a subnet of a more extensive environment to avoid bias by confounding, (ii) the use of external data to demonstrate the stability and reproducibility of the derived structures, (iii) a systematic literature review of the relevant issue, (iv) the use of structured knowledge from databases to support the derived findings and (v) a strategy for biological experiments derived from the findings in steps (i-iv).
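
The confounding issue in step (i) can be illustrated with a small simulation. The sketch below is not the article's method: it assumes Gaussian data, uses partial correlations from the inverse sample covariance as a stand-in for a network estimate, and all names (`partial_correlations`, `x`, `y`, `z`) and the simulated values are hypothetical. Two simulated genes `x` and `y` share a common driver `z` but do not interact directly. Estimating the network on the pair alone yields a strong spurious edge, whereas embedding the pair in the larger environment that contains `z` removes it.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation matrix from the inverse sample covariance.

    data: array of shape (n_samples, n_genes); needs more samples than genes.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated expression values (illustrative assumption, not real data).
rng = np.random.default_rng(0)
n_samples = 2000
z = rng.normal(size=n_samples)            # hypothetical common driver (confounder)
x = z + 0.5 * rng.normal(size=n_samples)  # gene driven by z only
y = z + 0.5 * rng.normal(size=n_samples)  # gene driven by z only

# Plain correlation between x and y, ignoring everything else.
marginal_xy = np.corrcoef(x, y)[0, 1]

# Subnet containing only x and y: the confounder is unobserved.
pcor_subnet = partial_correlations(np.column_stack([x, y]))[0, 1]

# Same pair analysed inside a larger environment that includes z.
pcor_with_z = partial_correlations(np.column_stack([x, y, z]))[0, 1]

print(f"marginal correlation x-y:          {marginal_xy:.3f}")
print(f"partial correlation, subnet only:  {pcor_subnet:.3f}")
print(f"partial correlation, z included:   {pcor_with_z:.3f}")
```

In this toy setting the x-y edge is strong when the subnet is analysed in isolation and close to zero once `z` is included, which is the kind of bias that analysing a subnet within a more extensive environment is meant to avoid.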

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Computational Biology
  • Databases, Genetic
  • Gene Expression Profiling*
  • Gene Regulatory Networks / genetics*
  • Genes, myc*
  • Humans
  • Lymphoma / genetics*
  • Oligonucleotide Array Sequence Analysis
  • Systems Biology