Reading of DNA sequence logos: prediction of major groove binding by information theory

Methods Enzymol. 1996;274:445-55. doi: 10.1016/s0076-6879(96)74036-3.

Abstract

DNA sequences to which the OxyR protein binds under oxidizing conditions were analyzed by the sequence logo method, a quantitative graphic technique based on information theory. A sequence logo shows both the sequence conservation and the frequencies of bases at each position in a site. Unlike consensus sequence analysis, the sequence logo analysis predicted that OxyR binds to four major grooves of DNA, a prediction later confirmed by experiments. Detailed interpretation of the sequence logo also allowed the prediction of likely major and minor groove OxyR-DNA base contacts, consistent with available experimental results. Because the sequence logo shows the original base frequencies in a clear, easily interpreted graphic that does not distort the data, binding site contacts can be analyzed in fine detail. These methods can be applied not only to any DNA sequence binding site but also to sites on RNA and proteins.
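
The quantities a sequence logo displays can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `logo_heights`, its parameters, and the toy aligned sites are hypothetical, and the toy sites are not OxyR binding sites. The sketch assumes the usual logo convention, in which the information at a position is the maximum of 2 bits for DNA minus the Shannon entropy of the observed base frequencies, optionally reduced by an approximate small-sample correction, and each letter's height is its frequency times that information.

```python
import math
from collections import Counter


def logo_heights(sites, alphabet="ACGT", small_sample_correction=True):
    """Return one dict per position mapping each base to its logo height in bits.

    Hypothetical helper for illustration; `sites` is a list of aligned,
    equal-length sequences.
    """
    n = len(sites)
    if n == 0:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must be aligned to the same length")

    max_bits = math.log2(len(alphabet))  # 2 bits for a four-letter alphabet
    # Approximate small-sample correction (an assumption of this sketch).
    e_n = (len(alphabet) - 1) / (2 * math.log(2) * n) if small_sample_correction else 0.0

    columns = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        freqs = {b: counts.get(b, 0) / n for b in alphabet}
        # Shannon entropy of the base frequencies at this position.
        entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        # Sequence conservation in bits; clamped at zero as a choice of this sketch.
        info = max(0.0, max_bits - (entropy + e_n))
        # Each letter's height is its frequency times the position's information.
        columns.append({b: f * info for b, f in freqs.items()})
    return columns


if __name__ == "__main__":
    toy_sites = ["ACGT", "ACGA", "ACGC", "ACGG"]  # made-up data
    for pos, heights in enumerate(logo_heights(toy_sites, small_sample_correction=False)):
        print(pos, {b: round(h, 3) for b, h in heights.items()})
```

In this toy alignment the first three positions are fully conserved and the last is evenly split among the four bases, so the stack heights show both how conserved each position is and which bases occur there.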

MeSH terms

  • Bacterial Proteins / metabolism
  • Base Composition
  • Base Sequence*
  • Binding Sites
  • DNA / chemistry*
  • DNA / metabolism*
  • DNA-Binding Proteins*
  • Databases, Factual
  • Escherichia coli / metabolism
  • Escherichia coli Proteins
  • Molecular Sequence Data
  • Nucleic Acid Conformation*
  • Repressor Proteins / metabolism*
  • Transcription Factors / metabolism*

Substances

  • Bacterial Proteins
  • DNA-Binding Proteins
  • Escherichia coli Proteins
  • Repressor Proteins
  • Transcription Factors
  • oxyR protein, E coli
  • DNA