G+C content dominates intrinsic nucleosome occupancy

BMC Bioinformatics. 2009 Dec 22:10:442. doi: 10.1186/1471-2105-10-442.

Abstract

Background: The relative preference of nucleosomes to form on individual DNA sequences plays a major role in genome packaging. A wide variety of DNA sequence features are believed to influence nucleosome formation, including periodic dinucleotide signals, poly-A stretches and other short motifs, and sequence properties that influence DNA structure, including base content. It was recently shown by Kaplan et al. that a probabilistic model using composition of all 5-mers within a nucleosome-sized tiling window accurately predicts intrinsic nucleosome occupancy across an entire genome in vitro. However, the model is complicated, and it is not clear which specific DNA sequence properties are most important for intrinsic nucleosome-forming preferences.

Results: We find that a simple linear combination of only 14 simple DNA sequence attributes (G+C content, two transformations of dinucleotide composition, and the frequency of eleven 4-bp sequences) explains nucleosome occupancy in vitro and in vivo in a manner comparable to the Kaplan model. G+C content and frequency of AAAA are the most important features. G+C content is dominant, alone explaining approximately 50% of the variation in nucleosome occupancy in vitro.

Conclusions: Our findings provide a dramatically simplified means to predict and understand intrinsic nucleosome occupancy. G+C content may dominate because it both reduces frequency of poly-A-like stretches and correlates with many other DNA structural characteristics. Since G+C content is enriched or depleted at many types of features in diverse eukaryotic genomes, our results suggest that variation in nucleotide composition may have a widespread and direct influence on chromatin structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition / genetics*
  • Base Sequence
  • Computational Biology / methods*
  • Databases, Genetic
  • Genome
  • Nucleosomes*

Substances

  • Nucleosomes