CpG doublets, CpG islands and Alu repeats in long human DNA sequences from different isochore families

Gene. 1998 Dec 11;224(1-2):123-7. doi: 10.1016/s0378-1119(98)00474-0.

Abstract

A computer analysis of 946 human DNA sequences longer than 50 kb, representing about 118 Mb of DNA, has led to the following observations. (i) Positive correlations hold between CpG levels and the GC levels of isochores and coding sequences, as expected from previous results. (ii) The correlation between CpG levels and the GC levels of pseudogenes is characterized by lower CpG values (at comparable GC levels) and by a lower slope than the correlation for coding sequences; this finding suggests that extensive methylation followed by deamination has taken place on CpG doublets of inactive genes, leading to a further CpG shortage. (iii) The frequency of CpG islands in long human sequences increases with increasing GC and almost parallels gene frequency. (iv) The frequency of Alu sequences also increases with increasing GC, but attains a maximum in H2 isochores, in agreement with previous experimental data. (v) The ratio 5mC/CpG (namely, the methylation level over available sites) decreases with increasing GC levels of isochores. This decrease is due only to a small extent to the increase of (unmethylated) CpG islands in GC-rich isochores, and takes place in spite of the increase of strongly methylated Alu sequences in GC-rich isochores; this stresses the much lower relative methylation (5mC/CpG) of single-copy sequences located in GC-rich isochores compared with those located in GC-poor isochores. (vi) CpG levels of Alus and CpG islands are positively correlated with the GC levels of the long sequences in which they are located. (vii) The CpG levels of both Alus and CpG islands increase with their GC levels.
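The two quantities correlated throughout the abstract, GC level and CpG level, can be illustrated with a minimal Python sketch. The abstract does not say how these values were computed, so the definitions below are assumptions: GC level is taken as the fraction of G and C among unambiguous bases, and CpG level is expressed as the widely used observed/expected ratio. The function names `gc_level` and `cpg_obs_exp` are hypothetical and are not taken from the paper.

```python
def gc_level(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Assumed definition; ambiguous symbols such as N are ignored.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Assumed measure of CpG level; the paper may have used a
    different one (for example, a plain dinucleotide frequency).
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    example = "ACGTTTCGCGAATT"  # toy sequence, not real data
    print(gc_level(example), cpg_obs_exp(example))
```

Applied per sequence, or per window within a long sequence, these two values would give the pairs one could plot to look for the kind of correlation described in points (i), (ii), (vi) and (vii).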

MeSH terms

  • 5-Methylcytosine
  • Alu Elements / genetics*
  • Amino Acid Sequence
  • Codon
  • CpG Islands / genetics*
  • Cytosine / analogs & derivatives
  • DNA / genetics*
  • Databases, Factual
  • Guanine
  • Humans
  • Pseudogenes
  • Statistics as Topic

Substances

  • Codon
  • Guanine
  • 5-Methylcytosine
  • Cytosine
  • DNA