The Escherichia coli transcriptome mostly consists of independently regulated modules

Nat Commun. 2019 Dec 4;10(1):5536. doi: 10.1038/s41467-019-13483-w.


Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome.
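The idea of signal summation can be made concrete with a small sketch. The abstract does not name the algorithm or give equations, so the code below simply assumes a linear model: the expression of each gene in each condition (relative to some baseline) is the sum, over signals, of a gene-specific weight times a condition-specific activity. All gene names, weights, activities, and the membership threshold are invented for illustration and are not taken from the paper.

```python
# Toy illustration of "signal summation". Every number below is invented.

# weights[gene][k]: how strongly signal k acts on each gene (assumed values).
weights = {
    "geneA": [2.0, 0.0],
    "geneB": [1.5, 0.0],
    "geneC": [0.0, -1.0],
    "geneD": [0.0, 0.0],
}

# activities[k][c]: how active signal k is in condition c (assumed values).
activities = [
    [0.0, 1.0, 1.0],  # signal 0: off in condition 0, on in conditions 1 and 2
    [0.0, 0.0, 2.0],  # signal 1: on only in condition 2
]


def reconstruct(weights, activities):
    """Sum the contribution of every signal to every gene in every condition."""
    n_conditions = len(activities[0])
    return {
        gene: [
            sum(w * activities[k][c] for k, w in enumerate(ws))
            for c in range(n_conditions)
        ]
        for gene, ws in weights.items()
    }


def gene_set(weights, signal, threshold=1.0):
    """Genes whose absolute weight for a signal meets a hypothetical cutoff."""
    return sorted(g for g, ws in weights.items() if abs(ws[signal]) >= threshold)


expression = reconstruct(weights, activities)
for gene, profile in expression.items():
    print(gene, profile)
print("signal 0 gene set:", gene_set(weights, 0))
print("signal 1 gene set:", gene_set(weights, 1))
```

In this toy setup each signal touches only a small gene set, and a condition's expression profile is explained by which signals are active in it. Recovering the weights and activities from measured data, which is the unsupervised learning step described above, is not shown here.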

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Escherichia coli / genetics*
  • Escherichia coli Proteins / genetics*
  • Escherichia coli Proteins / metabolism
  • Gene Expression Profiling
  • Gene Expression Regulation, Bacterial*
  • Gene Regulatory Networks / genetics*
  • Models, Genetic
  • Signal Transduction / genetics
  • Transcription Factors / genetics
  • Transcription Factors / metabolism
  • Transcriptome / genetics*

Substances

  • Escherichia coli Proteins
  • Transcription Factors