Proc Natl Acad Sci U S A. 2017 Sep 19;114(38):10286-10291.
doi: 10.1073/pnas.1702581114. Epub 2017 Sep 5.

Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities

Xin Fang et al. Proc Natl Acad Sci U S A. 2017.

Abstract

Transcriptional regulatory networks (TRNs) have been studied intensely for >25 y. Yet, even for the Escherichia coli TRN, probably the best characterized TRN, several questions remain. Here, we address three questions: (i) how complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism's TRN from disparate data types.

Keywords: matrix factorization; regression; transcriptional regulation; transcriptomics.
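
The abstract's idea of tracing differentially expressed genes (DEGs) to a knocked-out TF "through regulatory cascades" can be illustrated as a reachability query on a directed graph. The following is a minimal sketch, not the authors' implementation: it assumes the network is given as (regulator, target) pairs, uses breadth-first search, and counts only DEGs that appear in the network. The function names and the toy network (`tfA`, `gene1`, and so on) are hypothetical.

```python
from collections import deque


def reachable_targets(edges, source):
    """Return all nodes reachable from `source` along directed regulator -> target edges."""
    graph = {}
    for regulator, target in edges:
        graph.setdefault(regulator, set()).add(target)

    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def traced_fraction(edges, knocked_out_tf, degs):
    """Fraction of in-network DEGs that lie downstream of the knocked-out TF.

    Assumption: DEGs absent from the network are excluded from the denominator.
    """
    nodes = {n for edge in edges for n in edge}
    degs_in_network = set(degs) & nodes
    if not degs_in_network:
        return 0.0
    reach = reachable_targets(edges, knocked_out_tf)
    return len(degs_in_network & reach) / len(degs_in_network)


# Hypothetical toy network: tfA regulates tfB and gene1; tfB regulates gene2.
toy_edges = [("tfA", "tfB"), ("tfA", "gene1"), ("tfB", "gene2"), ("tfC", "gene3")]
toy_degs = ["gene1", "gene2", "gene3", "gene4"]  # gene4 is not in the network

fraction = traced_fraction(toy_edges, "tfA", toy_degs)
print(f"{fraction:.2f} of in-network DEGs are reachable from tfA")
```

In this toy case gene2 is reached only indirectly (tfA to tfB to gene2), which is the kind of cascade the abstract refers to, while gene3 is differentially expressed but has no path from the deleted TF.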

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Overview of our workflow. (A) RegulonDB (7) and additional published ChIP data were combined to reconstruct the hiTRN. (B) Using our hiTRN, we analyzed transcriptome shifts in expression compendia [EcoMAC (14), E. coli Expression 2 (15), and COLOMBOS (16)]. (C) We evaluated the completeness of our knowledge of our hiTRN on the basis of network consistency, dimensionality reduction, and detection of stable regulatory modules. (D) We assessed the ability of our hiTRN to quantitatively predict gene expression using mutual information (MI) and regression.
Fig. 2.
Consistency of hiTRN with observed differential gene expression in wild-type cells and in strains with a deleted TF gene. (A) Consistency of our TRN with observed differential and nondifferential gene expression accounting for regulatory bias (i.e., sign consistency). (B) Reachability (existence of contiguous regulatory paths in the TRN) from deleted TFs to DEGs in the TRN.
Fig. 3.
Regulon enrichment and functions of metagenes. Colors indicate functions of metagenes based on enriched regulons. Functionally related regulons are enriched in the same metagenes. Note that only TFs in modules 1, 3, and 5 are shown here. A full heatmap can be found in SI Appendix, Fig. S6.
Fig. 4.
Ten functional regulatory modules for 147 TFs. Rectangle sizes are proportional to regulon sizes (i.e., the number of regulated genes). The overlap between regulons is not shown. Modules are fully defined in Datasets S2 and S3.
Fig. 5.
Accuracy of expression predictions on training and held-out testing transcription units. (A) R² (coefficient of determination) of predicted expression profile vs. true expression profile using various regression models. (B) R² value of the testing dataset predicted by a Gaussian kernel SVR, grouped by number of known TFs. Error bars indicate SD for groups with >3 observations.
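
The coefficient of determination reported in Fig. 5 compares a predicted expression profile against the true one. The sketch below assumes the standard definition, R² = 1 − SS_res / SS_tot; the page does not state which variant the authors used, and the function name and sample values are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("profiles must be non-empty and of equal length")

    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the true profile is constant")
    return 1 - ss_res / ss_tot


# Hypothetical expression profile of one transcription unit across four samples.
true_profile = [1.0, 2.0, 3.0, 4.0]
predicted_profile = [1.0, 2.0, 3.0, 5.0]

print(r_squared(true_profile, predicted_profile))
```

Under this definition a perfect prediction gives 1, predicting the mean of the true profile for every sample gives 0, and predictions worse than the mean give negative values.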
