Sequence analysis of an 80 kb human neocentromere

Hum Mol Genet. 1999 Feb;8(2):217-27. doi: 10.1093/hmg/8.2.217.

Abstract

We previously described the cloning of an 80 kb DNA corresponding to the core protein-binding domain of a human chromosome 10-derived neocentromere. Here we report the complete sequence of this DNA (designated NC DNA) and its detailed structural analysis. The sequence is devoid of human centromeric alpha-satellite DNA and the pericentric beta- and gamma-satellites, the ATRS and 48 bp repeat DNA. One copy of a sequence that is related to the CENPB box motif is present, and a number of copies of other pericentric sequences including pJalpha and classical satellites I and III are present but both their relative sparsity and non-tandem organization suggest that each sequence, on its own, is unlikely to mimic any role the sequence may have in the normal centromere. The DNA-binding motifs of the architectural and regulatory proteins HMGI and topoII have a normal abundance and random distribution, implying that these sequences are not key functional elements. The total A + T content of the sequence is not notably different from that of the human genome, but an abundance of AT-rich islands and a biased distribution of these islands within the NC sequence are clearly discernible and may be functionally significant. Substantial amounts of transposable elements and low copy number tandem repeats, including several that are highly AT- and purine-rich, are also present and may act as functional elements. One of the AT-rich tandem repeats (AT28) may form interesting structures and is described in detail. The defined features show only a loose resemblance to the structures of known centromeres, highlighting the possibility that, rather than a conserved primary sequence, it is the overall composition and distribution patterns of various unknown functional elements, or any 'ordinary' DNA under appropriate epigenetic influences, that determine centromere formation and function. This is the first detailed analysis of a neocentromere DNA and provides a basis for comparison against future sequences.
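
The abstract separates overall A + T content from the local distribution of AT-rich islands. The following is a minimal sketch of how such a comparison could be made with a sliding window. It is not the method used in the paper: the function names, the window size, the step and the 0.75 threshold are all illustrative assumptions.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    return at / total if total else 0.0


def at_rich_islands(seq: str, window: int = 100, step: int = 50,
                    threshold: float = 0.75) -> list[tuple[int, int]]:
    """Return merged half-open (start, end) intervals of AT-rich windows.

    window, step and threshold are hypothetical defaults chosen for
    illustration only; they do not come from the paper.
    """
    islands: list[tuple[int, int]] = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        end = min(start + window, len(seq))
        if at_fraction(seq[start:end]) >= threshold:
            if islands and start <= islands[-1][1]:
                # Window overlaps or touches the previous island: extend it.
                islands[-1] = (islands[-1][0], end)
            else:
                islands.append((start, end))
    return islands


if __name__ == "__main__":
    # Synthetic toy sequence: an AT-only block flanked by GC-only blocks.
    toy = "GC" * 100 + "AT" * 100 + "GC" * 100
    print(round(at_fraction(toy), 3))
    print(at_rich_islands(toy))
```

On a sequence like the toy example, the global A + T fraction can look unremarkable while the window scan still isolates a localized AT-rich stretch, which is the kind of contrast the abstract describes.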

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenosine Triphosphate / genetics
  • Adenosine Triphosphate / metabolism
  • Autoantigens*
  • Base Sequence
  • Binding Sites / genetics
  • Centromere / genetics*
  • Centromere Protein B
  • Chromosomal Proteins, Non-Histone
  • Chromosomes, Human, Pair 10 / genetics
  • DNA / chemistry
  • DNA / genetics*
  • DNA Topoisomerases, Type II / genetics
  • DNA Topoisomerases, Type II / metabolism
  • DNA, Satellite
  • DNA-Binding Proteins / genetics
  • DNA-Binding Proteins / metabolism
  • Expressed Sequence Tags
  • HMGA1a Protein
  • High Mobility Group Proteins / genetics
  • High Mobility Group Proteins / metabolism
  • Humans
  • Microsatellite Repeats
  • Molecular Sequence Data
  • Sequence Analysis, DNA
  • Tandem Repeat Sequences
  • Thymine Nucleotides / genetics
  • Thymine Nucleotides / metabolism
  • Transcription Factors / genetics
  • Transcription Factors / metabolism

Substances

  • Autoantigens
  • CENPB protein, human
  • Centromere Protein B
  • Chromosomal Proteins, Non-Histone
  • DNA, Satellite
  • DNA-Binding Proteins
  • High Mobility Group Proteins
  • Thymine Nucleotides
  • Transcription Factors
  • HMGA1a Protein
  • Adenosine Triphosphate
  • DNA
  • DNA Topoisomerases, Type II
  • thymidine 5'-triphosphate