Conserved and species-specific transcription factor co-binding patterns drive divergent gene regulation in human and mouse

Nucleic Acids Res. 2018 Feb 28;46(4):1878-1894. doi: 10.1093/nar/gky018.


The mouse is widely used as a system to study human genetic mechanisms. However, extensive rewiring of transcriptional regulatory networks often confounds translation of findings between human and mouse. Site-specific gain and loss of individual transcription factor binding sites (TFBS) have caused functional divergence of orthologous regulatory loci, and so we must look beyond this positional conservation to understand common themes of regulatory control. Fortunately, transcription factor co-binding patterns shared across species often perform conserved regulatory functions. These can be compared to 'regulatory sentences' that retain the same meanings regardless of sequence and species context. By analyzing TFBS co-occupancy patterns observed in four human and mouse cell types, we learned a regulatory grammar: the rules by which TFBS are combined into meaningful regulatory sentences. Different parts of this grammar associate with specific sets of functional annotations regardless of sequence conservation and predict functional signatures more accurately than positional conservation. We further show that both species-specific and conserved portions of this grammar are involved in gene expression divergence and human disease risk. These findings expand our understanding of transcriptional regulatory mechanisms, suggesting that phenotypic divergence and disease risk are driven by a complex interplay between deeply conserved and species-specific transcriptional regulatory pathways.
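
To make the idea of TFBS co-occupancy concrete, the sketch below tallies how often pairs of factors are bound at the same locus. It is only an illustration of the concept and is not the method used in the study, which the abstract does not describe in detail. The locus names, the bound-factor sets, and the function name `co_binding_counts` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from the paper): each locus maps to the set of
# transcription factors observed to be bound there.
loci = {
    "locus_1": {"CTCF", "RAD21", "SMC3"},
    "locus_2": {"CTCF", "RAD21"},
    "locus_3": {"GATA1", "TAL1"},
    "locus_4": {"GATA1", "TAL1", "CTCF"},
}


def co_binding_counts(loci):
    """Count, for every pair of factors, the number of loci where both are bound."""
    counts = Counter()
    for factors in loci.values():
        # Sort so each pair has one canonical order, e.g. ("CTCF", "RAD21").
        for pair in combinations(sorted(factors), 2):
            counts[pair] += 1
    return counts


pair_counts = co_binding_counts(loci)

# List pairs from most to least frequently co-bound.
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Pairs that recur across many loci are candidate "words" in a co-binding pattern. Comparing such tallies between species would be one simple way to ask which patterns are shared and which are species-specific, under the assumption that binding calls are comparable across datasets.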

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Base Sequence
  • Binding Sites
  • Chromatin
  • Conserved Sequence
  • Disease / genetics
  • Evolution, Molecular
  • Gene Expression Regulation*
  • Genetic Loci
  • Humans
  • Immune System
  • Mice / genetics*
  • Polymorphism, Single Nucleotide
  • Species Specificity
  • Transcription Factors / metabolism*


Substances

  • Chromatin
  • Transcription Factors