Genome-scale analysis of the uses of the Escherichia coli genome: model-driven analysis of heterogeneous data sets

Timothy E Allen; Markus J Herrgård; Mingzhu Liu; Yu Qiu; Jeremy D Glasner; Frederick R Blattner; Bernhard Ø Palsson

doi:10.1128/JB.185.21.6392-6399.2003

J Bacteriol. 2003 Nov;185(21):6392-9. doi: 10.1128/JB.185.21.6392-6399.2003.

Authors

Timothy E Allen¹, Markus J Herrgård, Mingzhu Liu, Yu Qiu, Jeremy D Glasner, Frederick R Blattner, Bernhard Ø Palsson

Affiliation

¹ Department of Bioengineering, University of California-San Diego, La Jolla, California 92093-0412, USA.

Abstract

The recent availability of heterogeneous high-throughput data types has increased the need for scalable in silico methods with which to integrate data related to the processes of regulation, protein synthesis, and metabolism. A sequence-based framework for modeling transcription and translation in prokaryotes has been established and has been extended to study the expression state of the entire Escherichia coli genome. The resulting in silico analysis of the expression state highlighted three facets of gene expression in E. coli: (i) the metabolic resources required for genome expression and protein synthesis were found to be relatively invariant under the conditions tested; (ii) effective promoter strengths were estimated at the genome scale by using global mRNA abundance and half-life data, revealing genes subject to regulation under the experimental conditions tested; and (iii) large-scale genome location-dependent expression patterns with approximately 600-kb periodicity were detected in the E. coli genome based on the 49 expression data sets analyzed. These results support the notion that a structured model-driven analysis of expression data yields additional information that can be subjected to commonly used statistical analyses. The integration of heterogeneous genome-scale data (i.e., sequence, expression data, and mRNA half-life data) is readily achieved in the context of an in silico model.
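Points (ii) and (iii) of the abstract can be illustrated with a minimal sketch. It is not the authors' published model: it assumes simple first-order mRNA decay at steady state, so that the synthesis rate equals (ln 2 / half-life + growth rate) × abundance, and it looks for position-dependent periodicity by projecting mean-centered expression values onto sine and cosine waves of candidate periods. All function and parameter names below are hypothetical and chosen only for illustration.

```python
import math


def effective_promoter_strength(abundance, half_life_min, growth_rate_per_min=0.0):
    """Steady-state mRNA synthesis rate (copies per cell per minute).

    Assumption (not taken from the paper): first-order decay plus dilution,
    d[mRNA]/dt = k_syn - (k_deg + mu) * [mRNA] = 0 at steady state.
    """
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    k_deg = math.log(2) / half_life_min  # decay constant from the half-life
    return (k_deg + growth_rate_per_min) * abundance


def spectral_power(positions_bp, values, period_bp):
    """Power of the mean-centered signal at one candidate period.

    Works for unevenly spaced gene positions because it sums over the
    actual coordinates instead of relying on an FFT grid.
    """
    mean = sum(values) / len(values)
    cos_sum = 0.0
    sin_sum = 0.0
    for x, y in zip(positions_bp, values):
        angle = 2.0 * math.pi * x / period_bp
        cos_sum += (y - mean) * math.cos(angle)
        sin_sum += (y - mean) * math.sin(angle)
    return (cos_sum ** 2 + sin_sum ** 2) / len(values)


def dominant_period(positions_bp, values, candidate_periods_bp):
    """Return the candidate period (in bp) with the highest spectral power."""
    return max(
        candidate_periods_bp,
        key=lambda period: spectral_power(positions_bp, values, period),
    )
```

Calling `effective_promoter_strength` per gene with measured abundance and half-life gives a synthesis-rate estimate that can be compared across conditions. `dominant_period` takes gene coordinates, one expression value per gene, and a list of candidate periods in base pairs. A real analysis would also need a significance test for any peak, which this sketch omits.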

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Bacterial Proteins / biosynthesis
Bacterial Proteins / genetics
Data Interpretation, Statistical
Escherichia coli / genetics*
Escherichia coli / metabolism
Gene Expression Profiling* / statistics & numerical data
Gene Expression Regulation, Bacterial
Genome, Bacterial*
Models, Biological*
Oligonucleotide Array Sequence Analysis* / statistics & numerical data

Substances

Bacterial Proteins
