Bayesian network expansion identifies new ROS and biofilm regulators

PLoS One. 2010 Mar 3;5(3):e9513. doi: 10.1371/journal.pone.0009513.

Abstract

Signaling and regulatory pathways that guide gene expression have only been partially defined for most organisms. However, given the increasing number of microarray measurements, it may be possible to reconstruct such pathways and uncover missing connections directly from experimental data. Using a compendium of microarray gene expression data obtained from Escherichia coli, we constructed a series of Bayesian network models for the reactive oxygen species (ROS) pathway as defined by EcoCyc. A consensus Bayesian network model was generated from those networks sharing the top recovered score. This microarray-based network only partially agreed with the known ROS pathway curated from the literature and databases. A top network was then expanded, using an algorithm we termed 'BN+1', to predict genes that could enhance the Bayesian network model. This expansion procedure predicted many stress-related genes (e.g., dusB and uspE) and their possible interactions with other ROS pathway genes. A term enrichment method showed that biofilm-associated microarray data usually contained high expression levels of both uspE and gadX. The predicted involvement of uspE in the ROS pathway and the interactions between uspE and gadX were confirmed experimentally using E. coli reporter strains. Genes gadX and uspE showed a feedback relationship in regulating each other's expression. Gene knockout experiments verified that both genes regulate biofilm formation. These data suggest that the BN+1 expansion method can faithfully uncover hidden or unknown genes that have significant biological roles in a selected pathway. The BN+1 expansion method reported here is a generalized approach applicable to the characterization and expansion of other biological pathways and living systems.
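
The expansion idea described in the abstract (add one candidate gene at a time to a fixed core network and rank candidates by how much they improve the network score) can be illustrated with a minimal sketch. The abstract does not give the scoring function or the search procedure, so everything below is an assumption made for illustration: the score is a linear-Gaussian BIC used as a stand-in, each candidate is considered only as a child of core genes, the function names `gaussian_bic` and `bn_plus_one` are hypothetical, and the expression data are synthetic.

```python
import itertools
import numpy as np


def gaussian_bic(child, parents):
    """BIC of one linear-Gaussian node: child regressed on its parents.

    Stand-in score (an assumption); higher is better.
    """
    n = child.shape[0]
    # Design matrix: intercept column plus one column per parent.
    X = np.column_stack([np.ones(n)] + list(parents))
    coef, *_ = np.linalg.lstsq(X, child, rcond=None)
    resid = child - X @ coef
    sigma2 = max(float(resid @ resid) / n, 1e-12)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = X.shape[1] + 1  # regression coefficients plus the noise variance
    return loglik - 0.5 * k * np.log(n)


def bn_plus_one(core, candidates, max_parents=2):
    """Rank candidate genes by the best score gain from attaching them
    to the core network as a child of up to `max_parents` core genes.

    Returns (gene, gain, parents) tuples sorted by decreasing gain.
    """
    ranking = []
    for name, expr in candidates.items():
        base = gaussian_bic(expr, [])  # candidate with no parents
        best, best_parents = base, ()
        for size in range(1, max_parents + 1):
            for parent_set in itertools.combinations(core, size):
                score = gaussian_bic(expr, [core[p] for p in parent_set])
                if score > best:
                    best, best_parents = score, parent_set
        ranking.append((name, best - base, best_parents))
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking


# Synthetic demonstration data (not from the paper).
rng = np.random.default_rng(0)
n_arrays = 200
core = {
    "a": rng.normal(size=n_arrays),
    "b": rng.normal(size=n_arrays),
}
candidates = {
    # Depends on core gene "a", so it should rank first.
    "linked": 2.0 * core["a"] + 0.5 * rng.normal(size=n_arrays),
    # Independent of the core genes.
    "unlinked": rng.normal(size=n_arrays),
}

ranking = bn_plus_one(core, candidates)
for gene, gain, parents in ranking:
    print(f"{gene}: gain={gain:.1f}, parents={parents}")
```

In this sketch the candidate that truly depends on a core gene receives a large score gain, while the independent candidate receives little or none. A fuller treatment would also let the candidate act as a parent of core genes and would rescore the whole network rather than a single node.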

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • AraC Transcription Factor / biosynthesis
  • AraC Transcription Factor / genetics*
  • Bacterial Proteins / metabolism
  • Bayes Theorem
  • Biofilms*
  • Computational Biology / methods
  • Escherichia coli / genetics*
  • Escherichia coli Proteins / biosynthesis
  • Escherichia coli Proteins / genetics*
  • Gene Expression Profiling
  • Gene Expression Regulation, Bacterial*
  • Gene Regulatory Networks
  • Oligonucleotide Array Sequence Analysis
  • Plasmids / metabolism
  • Reactive Oxygen Species*
  • Signal Transduction

Substances

  • AraC Transcription Factor
  • Bacterial Proteins
  • Escherichia coli Proteins
  • GadX protein, E coli
  • Reactive Oxygen Species