Genomic DNA from animals shows contrasting strand bias in large and small subsequences

BMC Genomics. 2008 Jan 25:9:43. doi: 10.1186/1471-2164-9-43.

Abstract

Background: In eukaryotes, base composition shows almost no strand bias, except at origins of replication, transcription start sites and transcribed regions. This paper revisits the question for subsequences of DNA taken at random from the genome.

Results: For a typical mammal, for example mouse or human, there is a small strand bias throughout the genomic DNA: on the same strand, (G - C) correlates with (A - T), that is, the difference between the numbers of guanine and cytosine bases correlates with the difference between the numbers of adenine and thymine bases. For small subsequences (up to 1 kb) this correlation is weak but positive, whereas for large windows (around 50 kb to 2 Mb) it is strong and negative. This effect is largely independent of GC%. Transcribed and untranscribed regions give similar correlations for both small and large subsequences, but differ for intermediate-sized subsequences. An analysis of the human genome showed that position within the isochore structure did not affect these correlations. An analysis of available genomes from different species shows that this contrast between large and small windows is a general feature of mammals and birds. Further down the evolutionary tree, other organisms show a similar but smaller effect. Except for the nematode, all the animals analysed showed at least a small effect.
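The correlation described in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`window_skews`, `pearson`, `skew_correlation`) are hypothetical, the use of non-overlapping fixed-size windows is an assumption (the abstract speaks of subsequences taken at random), and a plain Pearson coefficient is assumed as the correlation measure.

```python
from math import sqrt


def window_skews(seq, window):
    """Return a (G - C, A - T) count pair for each non-overlapping window.

    Counts are taken on the given strand only. A trailing partial window
    is ignored. Windowing scheme is an assumption, not the paper's method.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc_diff = chunk.count("G") - chunk.count("C")
        at_diff = chunk.count("A") - chunk.count("T")
        skews.append((gc_diff, at_diff))
    return skews


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two windows")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined without variation
    return cov / sqrt(var_x * var_y)


def skew_correlation(seq, window):
    """Correlation between (G - C) and (A - T) across windows of one size."""
    return pearson(window_skews(seq, window))
```

Under these assumptions, calling `skew_correlation` on the same chromosome sequence with a window of about 1 kb and again with a window of 50 kb or more would be one way to compare the small-scale and large-scale correlations that the abstract contrasts.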

Conclusion: The large-scale correlations may be explained by DNA replication. Transcription may modify these effects but is not their fundamental cause. These results cast light on how DNA mutations affect the genome over evolutionary time. At least for vertebrates, there is a broad relationship between body temperature and the size of the correlation. The genomes of mammals and birds have a structure marked by strand-bias segments.

MeSH terms

  • Animals
  • Base Composition / genetics*
  • Base Sequence
  • Birds / genetics
  • DNA / chemistry
  • DNA / genetics*
  • DNA Replication
  • Genome / genetics*
  • Humans
  • Isochores / genetics
  • Mammals / genetics
  • Transcription, Genetic

Substances

  • Isochores
  • DNA