Heterogeneity of genomes: measures and values

Proc Natl Acad Sci U S A. 1994 Dec 20;91(26):12837-41. doi: 10.1073/pnas.91.26.12837.

Abstract

Genomic homogeneity is investigated for a broad base of DNA sequences in terms of dinucleotide relative abundance distances (abbreviated delta-distances) and of oligonucleotide compositional extremes. It is shown that delta-distances between different genomic sequences in the same species are low, only about 2 or 3 times the distance found in random DNA, and are generally smaller than the between-species delta-distances. Extremes in short oligonucleotides include underrepresentation of TpA and overrepresentation of GpC in most temperate bacteriophage sequences; underrepresentation of CTAG in most eubacterial genomes; underrepresentation of GATC in most bacteriophages; CpG suppression in vertebrates, in all animal mitochondrial genomes, and in many thermophilic bacterial sequences; and overrepresentation of GpG/CpC in all animal mitochondrial sets and chloroplast genomes. Interpretations center on DNA structures (dinucleotide stacking energies, DNA curvature and superhelicity, nucleosome organization), context-dependent mutational events, methylation effects, and processes of replication and repair.
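
The abstract does not spell out how a delta-distance is computed. The sketch below assumes the definition commonly used in the dinucleotide relative abundance literature: the odds ratio rho*_xy = f*_xy / (f*_x f*_y) is taken over a sequence together with its reverse complement, and the delta-distance between two sequences is the mean absolute difference of rho*_xy over the 16 dinucleotides. The function names (`reverse_complement`, `symmetrized_rho`, `delta_distance`) are hypothetical, and the handling of ambiguous bases (skipped) is an assumption rather than something stated in the record.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
BASE_SET = set(BASES)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def symmetrized_rho(seq):
    """Odds ratios rho*_xy = f*_xy / (f*_x * f*_y) for all 16 dinucleotides.

    Frequencies are pooled over the sequence and its reverse complement.
    Assumption: positions containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, reverse_complement(seq)):
        mono.update(c for c in strand if c in BASE_SET)
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if set(pair) <= BASE_SET:
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_di == 0:
        raise ValueError("sequence has no unambiguous dinucleotides")
    rho = {}
    for x, y in product(BASES, repeat=2):
        fx, fy = mono[x] / n_mono, mono[y] / n_mono
        fxy = di[x + y] / n_di
        # A dinucleotide whose bases are absent gets 0.0 (sketch-level choice).
        rho[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return rho


def delta_distance(seq_a, seq_b):
    """Mean absolute difference of rho*_xy over the 16 dinucleotides."""
    rho_a, rho_b = symmetrized_rho(seq_a), symmetrized_rho(seq_b)
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / 16
```

In this sketch a value of rho*_xy well below 1 corresponds to underrepresentation of the dinucleotide (as reported for CpG in vertebrates), and a value well above 1 to overrepresentation. Because both strands are pooled, a sequence and its reverse complement are at distance zero from each other.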

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Base Composition
  • DNA / chemistry*
  • DNA, Bacterial / chemistry
  • DNA, Fungal / chemistry
  • Eukaryotic Cells
  • Nucleic Acid Conformation
  • Oligodeoxyribonucleotides / chemistry
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Bacterial
  • DNA, Fungal
  • Oligodeoxyribonucleotides
  • DNA