Clustering of genes coding for DNA binding proteins in a region of atypical evolution of the human genome

J Mol Evol. 2004 Jul;59(1):72-9. doi: 10.1007/s00239-004-2605-z.


Comparison of the human and mouse genomes has revealed that evolutionary rates vary significantly among genomic regions and that a large part of this variation is interchromosomal. Using a large collection of introns, we confirm in this work that human chromosome 19 shows the highest divergence from mouse. To search for other differences among chromosomes, we examined the distribution of gene functions in human and mouse chromosomes using the Gene Ontology definitions. Correspondence analysis showed that one of the strongest clusterings of gene functions in human chromosomes is a group of genes coding for DNA binding proteins on chromosome 19. Interestingly, chromosome 19 also has a very high GC content, a feature that has been proposed to promote an opening of the chromatin, thereby facilitating binding of proteins to the DNA helix. In the mouse genome, however, no similar aggregation of genes coding for DNA binding proteins in a region of high GC content can be found. This suggests that the distribution of genes coding for DNA binding proteins and the variation in chromatin accessibility to these proteins differ between the human and mouse genomes. It is likely that the overall high synonymous and intron substitution rates in chromosome 19 are a by-product of the high GC content of this chromosome.
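The chromosome-by-function comparison described above starts from a contingency table of gene counts. The sketch below is not the authors' pipeline and is not a full correspondence analysis. It only computes the standardized residuals, (observed - expected) / sqrt(expected), that correspondence analysis goes on to decompose. The function name, the chromosome and function labels, and all counts are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def standardized_residuals(table):
    """Return (observed - expected) / sqrt(expected) for every cell.

    `table` is a list of rows of counts. Expected counts assume that
    rows (chromosomes) and columns (gene functions) are independent.
    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            res_row.append((observed - expected) / sqrt(expected))
        residuals.append(res_row)
    return residuals


# Hypothetical gene counts, NOT data from the study.
chromosomes = ["chr1", "chr19", "chrX"]
functions = ["DNA binding", "kinase activity", "transport"]
counts = [
    [30, 40, 50],
    [90, 20, 30],
    [20, 25, 35],
]

residuals = standardized_residuals(counts)

# Report the most over-represented chromosome/function pair.
value, i, j = max(
    (r, i, j) for i, row in enumerate(residuals) for j, r in enumerate(row)
)
print(f"{chromosomes[i]} / {functions[j]}: residual {value:.2f}")
```

A large positive residual marks a function that is over-represented on a chromosome relative to independence. A full correspondence analysis would go further and apply a singular value decomposition to the matrix of these residuals (computed on proportions) to place chromosomes and functions on shared axes.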

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition / genetics
  • Chromatin / metabolism
  • Chromosomes, Human, Pair 19 / genetics*
  • Computational Biology
  • DNA-Binding Proteins / genetics*
  • DNA-Binding Proteins / metabolism
  • Databases, Genetic
  • Evolution, Molecular*
  • Genome, Human*
  • Humans
  • Introns / genetics
  • Models, Genetic
  • Multigene Family / genetics*
  • Phylogeny
  • Zinc Fingers / genetics


Substances

  • Chromatin
  • DNA-Binding Proteins