DNA helix: the importance of being AT-rich

Mamm Genome. 2017 Oct;28(9-10):455-464. doi: 10.1007/s00335-017-9713-8. Epub 2017 Aug 23.

Abstract

AT-rich DNA is mostly associated with condensed chromatin, whereas GC-rich sequence is preferentially located in dispersed chromatin. AT-rich genes tend to be tissue-specific (silenced in most tissues), while GC-rich genes tend to be housekeeping (expressed in many tissues). This paper reports another important property of DNA base composition, which can affect the repertoire of genes with high AT content: GC-rich sequence is more liable to mutation. We found that the Spearman correlation between human gene GC content and mutation probability is above 0.9. A change of base composition even in synonymous sites affects the mutation probability of nonsynonymous sites and thus of the encoded proteins. There is a unique type of housekeeping gene that is especially hazardous when prone to mutation. Natural selection, which usually removes deleterious mutations, only increases the hazard in the case of these genes, because it can descend to the suborganismal (cellular) level. These are cell cycle-related genes. In accordance with the proposed concept, they have a low GC content at synonymous sites (despite being housekeeping genes). Gene-centred protein interaction enrichment analysis (PIEA) revealed core clusters of genes whose interactants are modularly enriched in genes with AT-rich synonymous codons. This interconnected network is involved in double-strand break repair, DNA integrity checkpoints and chromosome pairing at mitosis. Damage to these genes results in genome and chromosome instability, leading to cancer and other 'error catastrophes'. By reducing nonsynonymous mutations, the usage of AT-rich synonymous codons can decrease the probability of cancer more than 20-fold.
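The two quantities the abstract relies on, per-gene GC content and a Spearman rank correlation, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names are hypothetical, and using third codon positions (GC3) as a stand-in for synonymous sites is a simplifying assumption, since not every third-position change is synonymous.

```python
from statistics import mean


def gc_content(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_content(cds):
    """Fraction of G or C at third codon positions.

    Assumption: GC3 is used here only as a rough proxy for the
    GC content of synonymous sites.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third = cds[2:usable:3]
    return sum(base in "GC" for base in third) / len(third)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Given one GC value and one mutation-probability estimate per gene, `spearman(gc_values, mutation_probabilities)` would return the kind of rank correlation the abstract reports. How the paper actually estimated mutation probability is not stated in the abstract, so that input is left to the reader.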

MeSH terms

  • AT Rich Sequence / genetics*
  • Animals
  • Base Composition / genetics*
  • Cell Cycle / genetics
  • Codon
  • DNA / chemistry*
  • DNA / genetics*
  • DNA / physiology
  • Databases, Genetic
  • Evolution, Molecular*
  • GC Rich Sequence / genetics
  • Genome, Human / genetics
  • Genome, Human / physiology
  • Humans
  • Models, Genetic
  • Mutation / genetics*
  • Proteins / genetics
  • Proteins / physiology
  • Selection, Genetic / genetics

Substances

  • Codon
  • Proteins
  • DNA