Information theory based T7-like promoter models: classification of bacteriophages and differential evolution of promoters and their polymerases

Nucleic Acids Res. 2005 Oct 31;33(19):6172-87. doi: 10.1093/nar/gki915. Print 2005.


Molecular information theory was used to create sequence logos and promoter models for eight phages of the T7 group: T7, phiA1122, T3, phiYeO3-12, SP6, K1-5, gh-1 and K11. When these models were used to scan the corresponding genomes, a significant gap in the individual information distribution was observed between functional promoter sites and other sequences, suggesting that the models can be used to identify new T7-like promoters. When a combined 76-site model was used to scan the eight phages, 108 of the 109 promoters were found, whereas none were found in the other T7-like phages (phiKMV, P60, VpV262, SIO1, PaP3, Xp10, P-SSP7 and Ppu40), indicating that these phages do not belong to the T7 group. We propose that the T7-like transcription system, which consists of a phage-specific RNA polymerase and a set of conserved T7-like promoters, is a hallmark feature of the T7 group and can be used to classify T7-like phages. Phylogenetic trees of the T7-like promoter models and their corresponding RNA polymerases are similar, suggesting that the eight phages of the T7 group can be classified into five subgroups. However, the SP6-like polymerases have apparently diverged from other polymerases more than their promoters have diverged from other promoters.
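The scanning step described above can be illustrated with a minimal sketch. It is not the authors' software: the function names, the toy aligned sites and the toy genome string are all hypothetical, and a simple pseudocount is assumed in place of the small-sample correction normally used in molecular information theory. Each position of the model is given a weight of 2 + log2 f(b, l) bits, where f(b, l) is the frequency of base b at position l in the aligned sites, and the individual information of a candidate sequence is the sum of the weights of its bases.

```python
import math

BASES = "ACGT"


def weight_matrix(sites, pseudocount=1.0):
    """Build per-position weights 2 + log2 f(b, l) from aligned sites.

    Assumption: a pseudocount spread over the four bases stands in for
    the small-sample correction, so unseen bases get a finite penalty.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    n = len(sites)
    matrix = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        weights = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount / 4) / (n + pseudocount)
            weights[base] = 2.0 + math.log2(freq)
        matrix.append(weights)
    return matrix


def individual_information(matrix, seq):
    """Sum the position weights of one sequence (in bits)."""
    return sum(weights[base] for weights, base in zip(matrix, seq))


def scan(matrix, genome, threshold=0.0):
    """Return (offset, bits) for every window scoring above threshold."""
    width = len(matrix)
    hits = []
    for offset in range(len(genome) - width + 1):
        bits = individual_information(matrix, genome[offset:offset + width])
        if bits > threshold:
            hits.append((offset, bits))
    return hits


# Toy data, invented for illustration only (not real promoter sites).
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
genome = "GGCC" + "TATAAT" + "CCGG"

matrix = weight_matrix(sites)
hits = scan(matrix, genome, threshold=5.0)
for offset, bits in hits:
    print(f"offset {offset}: {bits:.2f} bits")
```

In this toy case the embedded site scores well above the threshold while every other window scores below zero, which mirrors on a small scale the gap between functional sites and other sequences that the abstract reports.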

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • DNA-Directed RNA Polymerases / classification*
  • DNA-Directed RNA Polymerases / genetics
  • Evolution, Molecular*
  • Genome, Viral
  • Genomics
  • Information Theory
  • Models, Genetic
  • Phylogeny*
  • Podoviridae / classification*
  • Podoviridae / enzymology
  • Podoviridae / genetics*
  • Promoter Regions, Genetic*
  • Viral Proteins / classification*
  • Viral Proteins / genetics


Substances

  • Viral Proteins
  • DNA-Directed RNA Polymerases