Motivation: Completely sequenced genomes allow for detection and analysis of the relatively weak periodicities of 10-11 base pairs (bp). Two sources contribute to such signals: correlations in the encoded protein sequences, which arise from the amphipathic character of alpha-helices, and the folding of DNA (nucleosomal patterns, DNA supercoiling). Since the topological state of genomic DNA is important for its replication, recombination and transcription, there is immediate interest in obtaining information about the supercoiled state from sequence periodicities.
Results: We show that correlations within proteins mainly affect the oscillations at distances below 35 bp. The long-range correlations up to 100 bp primarily reflect DNA folding. For the yeast genome these oscillations are consistent in detail with the chromatin structure. For eubacteria and archaea the periods deviate significantly from the 10.55 bp value for free DNA. These deviations suggest that, while a period of 11 bp in bacteria reflects negative supercoiling, the significantly different period close to 10 bp in thermophilic archaea corresponds to positive supercoiling of their genomes.
Availability: Protein sets and C programs for the calculation of correlation functions are available on request from the authors (see http://itb.biologie.hu-berlin.de).