Core promoter STRs: novel mechanism for inter-individual variation in gene expression in humans

Gene. 2012 Jan 15;492(1):195-8. doi: 10.1016/j.gene.2011.10.028. Epub 2011 Oct 21.

Abstract

In a genome-scale analysis of the composition of core promoter sequences, we have recently shown that approximately 25% of human protein-coding genes have at least one short tandem repeat (STR) of 3-repeats in their core promoters (i.e. the interval from -120 to +1). Through their nucleosome processing effect, GA-repeats play a crucial role in the regulation of gene transcription. In this study, we chose the human SRY (sex determining region Y)-box 5 (SOX5) gene as a prototype of GA-rich core promoters to investigate the role of core promoter GA-STRs in gene expression. The human SOX5 gene is indispensable for diverse embryonic developmental processes, ranging from oligodendrocyte development and corticogenesis to chondrogenesis and regulation of the cell cycle. Whereas the absolute purine/pyrimidine ratio of 99% of genes ranges between 0.2 and 2, the core promoter of the two most ubiquitously expressed mRNAs of the human SOX5 gene (transcript IDs: ENST00000451604 and ENST00000309359) is exceptionally rich in purine nucleotides (purine/pyrimidine ratio: 61.5). Indeed, this core promoter is an island of four tandem GA-STRs, and lacks the known TATA and TATA-less elements for gene transcription. Evolutionary conservation of this region between human and mouse (75% homology) supports an important functional role for this promoter. In this study, we show that this sequence is indeed a potent promoter (p < 1×10^-10), and that different haplotypes across the region result in significant differences in gene expression (p < 1×10^-6). To our knowledge, this is the first report of functional STRs in a human gene core promoter. In our search of the core promoters of all human protein-coding genes annotated in the GeneCards database (19,927 genes) for the presence of pure GA-STRs, 429 genes contained at least one GA(3)-repeat in their core promoter. Core promoters with pure GA-STRs of GA(4) and above were observed in 61 genes. Our data reveal a novel mechanism for inter-individual variation in gene expression and complex traits/phenotypes through core promoter GA-STRs.
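
The two sequence measures used above, the purine/pyrimidine ratio and the presence of pure GA-STRs, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy sequence, and the choice to scan a single strand for uninterrupted runs of the unit "GA" are all assumptions made for illustration only.

```python
import re

PURINES = set("AG")
PYRIMIDINES = set("CT")


def purine_pyrimidine_ratio(seq):
    """Return (number of A/G bases) / (number of C/T bases) for one strand.

    Hypothetical helper; returns infinity if the sequence has no pyrimidines.
    """
    seq = seq.upper()
    purines = sum(base in PURINES for base in seq)
    pyrimidines = sum(base in PYRIMIDINES for base in seq)
    if pyrimidines == 0:
        return float("inf")
    return purines / pyrimidines


def find_ga_strs(seq, min_repeats=3):
    """Return (start, repeat_count) for each uninterrupted run of 'GA'.

    Assumption: a "pure" GA-STR is read here as 'GA' repeated at least
    min_repeats times with no interruption, on the given strand only.
    """
    pattern = re.compile(r"(?:GA){%d,}" % min_repeats)
    return [(m.start(), len(m.group()) // 2)
            for m in pattern.finditer(seq.upper())]


# Toy sequence (invented, not the SOX5 core promoter).
toy = "CCTGAGAGATTTGAGAGAGAGACC"
print(purine_pyrimidine_ratio(toy))       # 16 purines / 8 pyrimidines
print(find_ga_strs(toy, min_repeats=3))   # runs of GA(3) and above
print(find_ga_strs(toy, min_repeats=4))   # runs of GA(4) and above
```

Applied to a real core promoter, the input would be the sequence spanning -120 to +1 relative to the transcription start site; how that interval is retrieved and which strand is scanned are left open here.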

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition
  • Gene Expression Regulation*
  • Genetic Variation
  • Haplotypes
  • Humans
  • Microsatellite Repeats*
  • Promoter Regions, Genetic*
  • SOXD Transcription Factors / genetics*

Substances

  • SOX5 protein, human
  • SOXD Transcription Factors