Sigma70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals

J Mol Biol. 2003 Oct 17;333(2):261-78. doi: 10.1016/j.jmb.2003.07.017.


We present a computational analysis showing that sigma70 housekeeping promoters in Escherichia coli are located within zones with high densities of promoter-like signals, and we introduce strategies that allow for the correct computer prediction of sigma70 promoters. Based on 599 experimentally verified promoters of E. coli K-12, we generated and evaluated more than 200 weight matrices, optimizing different criteria to obtain the best recognition matrices. The alignments generating the best statistical models did not fully correspond with the canonical sigma70 model. However, matrices that correspond to the canonical model performed better as tools for prediction. We tested the predictive capacity of these matrices on 250 bp regions upstream of gene starts, where 90% of the known promoters occur. The matrix models generated an average of 38 promoter-like signals within each 250 bp region. In more than 50% of the cases, the true promoter does not have the best score within the region. We observed, in fact, that real promoters occur mostly within regions with high densities of overlapping putative promoters. We evaluated several strategies to identify promoters. The best one combines an intrinsic score of the -10 and -35 hexamers that form the promoter with an extrinsic score based on the distribution of promoter positions relative to the start of the gene. We were able to correctly identify 86% of the true promoters, generating an average of 4.7 putative promoters per region as output, of which 3.7, on average, occur in clusters, as a series of overlapping, potentially competing RNA polymerase-binding sites. As far as we know, this is the highest predictive capability reported so far. This high signal density is found mainly within regions upstream of genes, in contrast to coding regions and regions located between convergently transcribed genes.
These results are consistent with experimental evidence showing the existence of multiple overlapping promoter sites that become functional under particular conditions. This density is probably the consequence of a large number of evolutionary vestiges of promoters. We suggest that transcriptional regulators as well as other functional promoters play an important role in keeping these latent signals suppressed.
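
The scoring strategy described in the abstract (an intrinsic score from the -35 and -10 hexamers plus an extrinsic score from promoter position relative to the gene start) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy hexamer alignments, the log-odds matrix construction, the 15 to 19 bp spacer range, the Gaussian-shaped position weight, and all function names (`build_matrix`, `hexamer_score`, `position_log_weight`, `scan_region`) are assumptions made only for illustration.

```python
import math

BASES = "ACGT"

# Toy aligned hexamers, invented for illustration (NOT data from the study).
TOY_MINUS35 = ["TTGACA", "TTGACT", "TTTACA", "TTGATA", "CTGACA"]
TOY_MINUS10 = ["TATAAT", "TATGAT", "TACAAT", "TATACT", "GATAAT"]


def build_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix (one dict per column) from aligned sites."""
    matrix = []
    for column in zip(*sites):
        counts = {b: pseudocount for b in BASES}
        for base in column:
            counts[base] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return matrix


def hexamer_score(matrix, hexamer):
    """Sum the per-column log-odds of a hexamer (uppercase A/C/G/T only)."""
    return sum(col[base] for col, base in zip(matrix, hexamer))


def position_log_weight(distance, mean=60.0, sd=40.0):
    """Hypothetical extrinsic score: Gaussian-shaped log weight on the
    distance (bp) from the end of the -10 hexamer to the gene start.
    The mean and sd values are placeholders, not fitted parameters."""
    z = (distance - mean) / sd
    return -0.5 * z * z


def scan_region(region, m35, m10, spacers=range(15, 20), min_intrinsic=0.0):
    """Return candidate promoters sorted by combined score, best first.

    The region is assumed to end exactly at the gene start.
    """
    hits = []
    for i in range(len(region) - 5):
        s35 = hexamer_score(m35, region[i:i + 6])
        for spacer in spacers:
            j = i + 6 + spacer  # start of the -10 hexamer
            if j + 6 > len(region):
                break
            intrinsic = s35 + hexamer_score(m10, region[j:j + 6])
            if intrinsic < min_intrinsic:
                continue
            distance = len(region) - (j + 6)
            hits.append({
                "start35": i,
                "spacer": spacer,
                "intrinsic": intrinsic,
                "combined": intrinsic + position_log_weight(distance),
            })
    return sorted(hits, key=lambda h: h["combined"], reverse=True)


if __name__ == "__main__":
    m35 = build_matrix(TOY_MINUS35)
    m10 = build_matrix(TOY_MINUS10)
    demo = "GCGC" * 5 + "TTGACA" + "GCGCGCGCGCGCGCGCG" + "TATAAT" + "GCGC" * 5
    for hit in scan_region(demo, m35, m10)[:3]:
        print(hit)
```

Even in this toy setting, a single region can yield several overlapping candidates above threshold, which is why the sketch returns a ranked list of hits rather than a single best-scoring site.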

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Bacterial Proteins / genetics*
  • Bacterial Proteins / metabolism
  • Conserved Sequence
  • DNA-Directed RNA Polymerases / genetics*
  • DNA-Directed RNA Polymerases / metabolism
  • Escherichia coli / enzymology*
  • Gene Expression Regulation, Bacterial
  • Genes, Bacterial
  • Genes, Overlapping
  • Genetic Variation
  • Promoter Regions, Genetic*
  • Sigma Factor / genetics*
  • Sigma Factor / metabolism
  • Transcription, Genetic*

Substances

  • Bacterial Proteins
  • Sigma Factor
  • RNA polymerase sigma 70
  • DNA-Directed RNA Polymerases