Mapping Z-DNA in the human genome. Computer-aided mapping reveals a nonrandom distribution of potential Z-DNA-forming sequences in human genes

J Biol Chem. 1992 Jun 15;267(17):11846-55.

Abstract

In this work, we have predicted and mapped the potential Z-DNA-forming sequences in over one million base pairs of human DNA, containing 137 complete genes. The computer program (Z-Hunt-II) developed for this study uses a rigorous thermodynamic search strategy to map the occurrence of left-handed Z-DNA in genomic sequences. The search algorithm has been optimized to search large sequences for the potential occurrence of Z-DNA, taking into account sequence type, length, and cooperativity for a given stretch of potential Z-DNA-forming nucleotides. In this extensive data set we have identified 329 potential Z-DNA-forming sequences. The exact locations of the potential Z-DNA-forming sequences in the data set have been mapped with respect to the location of structural features of the genes. This analysis reveals a distinctly nonrandom distribution of potential Z-DNA-forming sequences across human genes and, most notably, that strong Z-DNA-forming sequences are more commonly found near the 5' ends of genes. We find that 35% of the Z-DNA-forming sequences are located upstream of the first expressed exon, while only 3% of the sequences are located downstream of the last expressed exon. The remaining 62% of the Z-DNA-forming sequences, which are located either in introns (47.1%) or exons (14.9%), are also nonrandomly distributed, with a strong bias toward locations near the site of transcription initiation. We interpret this distribution of potential Z-DNA-forming sequences toward the 5' end of human genes in terms of the well-established "twin-domain model" of transcription-induced supercoiling and the effect of this topological strain on Z-DNA formation in eukaryotic cells.
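The mapping step described in the abstract (locating candidate sequences, then tallying them by position relative to gene structure) can be illustrated with a rough sketch. This is not Z-Hunt-II: the abstract gives no implementation details, and the thermodynamic scoring is replaced here by a much cruder stand-in that only flags runs of alternating purines and pyrimidines, the sequence pattern generally associated with Z-DNA. The function names, the `min_len` threshold, the plus-strand coordinate convention, and the treatment of every listed exon as expressed are all assumptions made for illustration.

```python
from collections import Counter

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def _alternates(a, b):
    """True when a and b are one purine and one pyrimidine, in either order."""
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)


def alternating_runs(seq, min_len=8):
    """Return half-open (start, end) spans of alternating purine/pyrimidine runs.

    Crude proxy only: it ignores the sequence-specific energetics and
    cooperativity that a thermodynamic search would take into account.
    The min_len default is an arbitrary illustrative threshold.
    """
    seq = seq.upper()
    spans = []
    start = 0
    for i in range(1, len(seq) + 1):
        # A run ends at the end of the sequence or where two neighbours
        # fail to alternate (same base class, or a non-ACGT symbol).
        if i == len(seq) or not _alternates(seq[i - 1], seq[i]):
            if i - start >= min_len:
                spans.append((start, i))
            start = i
    return spans


def classify(pos, exons):
    """Label a position relative to one gene's exons.

    exons: non-empty list of half-open (start, end) intervals on the plus
    strand (hypothetical annotation format; all exons treated as expressed).
    """
    first_start = min(start for start, _ in exons)
    last_end = max(end for _, end in exons)
    if pos < first_start:
        return "upstream"
    if pos >= last_end:
        return "downstream"
    if any(start <= pos < end for start, end in exons):
        return "exon"
    return "intron"


def region_fractions(positions, exons):
    """Fraction of positions falling in each region class."""
    counts = Counter(classify(p, exons) for p in positions)
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}
```

For a gene with hypothetical exons `[(100, 200), (300, 400)]`, calling `region_fractions` on the start coordinates returned by `alternating_runs` yields the kind of upstream/intron/exon/downstream breakdown that the abstract reports as percentages.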

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Base Sequence
  • Chromosome Mapping
  • DNA / genetics*
  • Genome, Human*
  • Humans
  • Molecular Sequence Data
  • Software
  • Thermodynamics
  • Transcription, Genetic

Substances

  • DNA