An interpretable bimodal neural network characterizes the sequence and preexisting chromatin predictors of induced transcription factor binding

Genome Biol. 2021 Jan 7;22(1):20. doi: 10.1186/s13059-020-02218-6.

Abstract

Background: Transcription factor (TF) binding specificity is determined via a complex interplay between the transcription factor's DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with transcription factor binding in a given cell type have been well characterized. For instance, the binding sites for a majority of transcription factors display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding activities of the transcription factor itself and thus provide limited insight into how genome-wide TF-DNA binding patterns became established in the first place. To understand the determinants of transcription factor binding specificity, we therefore need to examine how newly activated transcription factors interact with sequence and preexisting chromatin landscapes.

Results: Here, we investigate the sequence and preexisting chromatin predictors of TF-DNA binding by examining the genome-wide occupancy of transcription factors that have been induced in well-characterized chromatin environments. We develop Bichrom, a bimodal neural network that jointly models sequence and preexisting chromatin data to interpret the genome-wide binding patterns of induced transcription factors. We find that the preexisting chromatin landscape is a differential global predictor of TF-DNA binding: incorporating preexisting chromatin features substantially improves our ability to explain the binding specificity of some transcription factors, but not others. Furthermore, by analyzing site-level predictors, we show that transcription factor binding in previously inaccessible chromatin tends to correspond to the presence of more favorable cognate DNA sequences.

Conclusions: Bichrom thus provides a framework for modeling, interpreting, and visualizing the joint sequence and chromatin landscapes that determine TF-DNA binding dynamics.
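
To make the bimodal idea in the Results concrete, the following is a minimal sketch and not the Bichrom implementation. It assumes each branch reduces its input to a single score: a sequence branch that scans one hand-set filter along the DNA, and a chromatin branch that takes a weighted mean of preexisting signal tracks. The two scores are combined through a logistic function. All function names, the example filter, the weights, and the bias are hypothetical illustration values; a real model would learn them from data.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of four 0/1 values in A, C, G, T order."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]


def sequence_score(seq, motif_filter):
    """Sequence branch: slide one filter along the sequence and keep the
    best-scoring window (a single convolution followed by max pooling).
    Assumes the sequence is at least as long as the filter."""
    x = one_hot(seq)
    width = len(motif_filter)
    window_scores = [
        sum(x[i + j][k] * motif_filter[j][k]
            for j in range(width) for k in range(4))
        for i in range(len(x) - width + 1)
    ]
    return max(window_scores)


def chromatin_score(tracks, weights):
    """Chromatin branch: weighted sum of the mean signal of each
    preexisting chromatin track over the window."""
    return sum(w * (sum(t) / len(t)) for w, t in zip(weights, tracks))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bimodal_probability(seq, tracks, motif_filter, track_weights,
                        w_seq=1.0, w_chrom=1.0, bias=0.0):
    """Combine the two branch scores into one binding probability.
    w_seq, w_chrom, and bias are hypothetical, hand-set parameters."""
    s = sequence_score(seq, motif_filter)
    c = chromatin_score(tracks, track_weights)
    return sigmoid(w_seq * s + w_chrom * c + bias)


# Hypothetical filter that rewards exact matches to the string "CAGCTG".
toy_filter = one_hot("CAGCTG")

# Same sequence, scored with high versus absent preexisting signal.
p_open = bimodal_probability("TTCAGCTGAA", [[5.0, 5.0]], toy_filter,
                             [0.5], bias=-6.0)
p_closed = bimodal_probability("TTCAGCTGAA", [[0.0, 0.0]], toy_filter,
                               [0.5], bias=-6.0)
print(p_open > p_closed)
```

Because each branch contributes one scalar, the relative sizes of the weighted sequence and chromatin terms can be compared site by site. This mirrors, in a simplified way, the kind of site-level interpretation the abstract describes.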

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Basic Helix-Loop-Helix Transcription Factors / metabolism
  • Binding Sites / genetics
  • Chromatin*
  • DNA-Binding Proteins / metabolism
  • Gene Expression Regulation
  • Genome
  • Histones / metabolism
  • Humans
  • Neural Networks, Computer*
  • Protein Binding / genetics*
  • Transcription Factors / metabolism*

Substances

  • ASCL1 protein, human
  • Basic Helix-Loop-Helix Transcription Factors
  • Chromatin
  • DNA-Binding Proteins
  • Histones
  • Transcription Factors