The analysis of stress-induced duplex destabilization in long genomic DNA sequences

J Comput Biol. 2004;11(4):519-43. doi: 10.1089/cmb.2004.11.519.


We present a method for calculating the predicted locations and extents of stress-induced DNA duplex destabilization (SIDD) as functions of base sequence and stress level in long DNA molecules. Base pair denaturation energies are assigned individually, so the influences of near neighbors, methylated bases, adducts, or lesions can be included. Sample calculations indicate that copolymeric energetics give results close to those obtained with full near-neighbor energetics; small but potentially informative differences occur only in the calculated SIDD properties of moderately destabilized regions. To analyze long sequences, the method calculates the destabilization properties within windows of fixed length N, with successive windows displaced by an offset distance d_o. The final value of each destabilization parameter for a given base pair is the weighted average of the values computed in every window that contains that base pair. This approach implicitly assumes that the direct coupling between remote base pairs induced by the imposed stress attenuates with their separation distance. The strategy enables calculation of the destabilization properties of DNA sequences of any length, up to and including complete chromosomes. We illustrate its utility by calculating the destabilization properties of the entire E. coli genomic DNA sequence. A preliminary analysis of the results shows a highly statistically significant association between promoters and SIDD regions, suggesting that SIDD attributes may prove useful in the computational prediction of promoter locations in prokaryotes.
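The windowing scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting function, so the triangular weights below are an assumption, and `toy_per_window` is a hypothetical stand-in for the actual per-window SIDD calculation. The names `window_starts`, `triangular_weights`, and `windowed_average` are likewise invented for illustration, and the sequence is treated as linear.

```python
from typing import Callable, List, Sequence


def window_starts(seq_len: int, n: int, d_o: int) -> List[int]:
    """Start positions of length-n windows displaced by offset d_o."""
    if not 0 < d_o <= n:
        raise ValueError("offset d_o must satisfy 0 < d_o <= n to cover every base pair")
    if seq_len <= n:
        return [0]
    starts = list(range(0, seq_len - n + 1, d_o))
    if starts[-1] != seq_len - n:
        starts.append(seq_len - n)  # extra window flush with the sequence end
    return starts


def triangular_weights(n: int) -> List[float]:
    """ASSUMED weighting: highest at the window centre, lowest at its edges."""
    return [float(min(i + 1, n - i)) for i in range(n)]


def windowed_average(
    seq: str,
    n: int,
    d_o: int,
    per_window: Callable[[str], Sequence[float]],
) -> List[float]:
    """Weighted average, per base pair, of values computed in each window."""
    total = [0.0] * len(seq)
    weight = [0.0] * len(seq)
    for start in window_starts(len(seq), n, d_o):
        chunk = seq[start:start + n]
        values = per_window(chunk)  # one value per base pair in the window
        for i, (v, w) in enumerate(zip(values, triangular_weights(len(chunk)))):
            total[start + i] += w * v
            weight[start + i] += w
    return [t / w for t, w in zip(total, weight)]


def toy_per_window(chunk: str) -> List[float]:
    """HYPOTHETICAL placeholder, not a SIDD calculation: flags A/T base pairs."""
    return [1.0 if base in "AT" else 0.0 for base in chunk]


profile = windowed_average("ATGCATGCAT", n=4, d_o=2, per_window=toy_per_window)
```

Weighting the centre of each window more heavily is one way to reflect the stated assumption that coupling between base pairs weakens with distance, since values computed near a window edge lack the context beyond that edge. In a real analysis `per_window` would be replaced by the full destabilization calculation for the window.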

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Base Pairing
  • Biomechanical Phenomena
  • Computational Biology
  • DNA / chemistry*
  • DNA / genetics*
  • DNA, Bacterial / chemistry
  • DNA, Bacterial / genetics
  • DNA, Superhelical / chemistry
  • DNA, Superhelical / genetics
  • Drug Stability
  • Escherichia coli / genetics
  • Genome, Bacterial
  • Genomics / methods
  • Genomics / statistics & numerical data
  • Models, Biological
  • Nucleic Acid Conformation*
  • Nucleic Acid Denaturation
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / statistics & numerical data
  • Thermodynamics

Substances

  • DNA, Bacterial
  • DNA, Superhelical
  • DNA