Neural network modeling of differential binding between wild-type and mutant CTCF reveals putative binding preferences for zinc fingers 1-2

BMC Genomics. 2022 Apr 12;23(1):295. doi: 10.1186/s12864-022-08486-9.

Abstract

Background: Many transcription factors (TFs), such as multi zinc-finger (ZF) TFs, have multiple DNA binding domains (DBDs), and deciphering the DNA binding motifs of individual DBDs is a major challenge. One example of such a TF is CCCTC-binding factor (CTCF), a TF with eleven ZFs that plays a variety of roles in transcriptional regulation, most notably anchoring DNA loops. Previous studies found that CTCF ZFs 3-7 bind CTCF's core motif and ZFs 9-11 bind a specific upstream motif, but the motifs of ZFs 1-2 have yet to be identified.

Results: We developed a new approach to identifying the binding motifs of individual DBDs of a TF through analyzing chromatin immunoprecipitation sequencing (ChIP-seq) experiments in which a single DBD is mutated: we train a deep convolutional neural network to predict whether wild-type TF binding sites are preserved in the mutant TF dataset and interpret the model. We applied this approach to mouse CTCF ChIP-seq data and identified the known binding preferences of CTCF ZFs 3-11 as well as a putative GAG binding motif for ZF 1. We analyzed other CTCF datasets to provide additional evidence that ZF 1 is associated with binding at the motif we identified, and we found that the presence of the motif for ZF 1 is associated with CTCF ChIP-seq peak strength.
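As an illustration of the training setup described above, the following is a minimal sketch of how wild-type peaks might be labeled as preserved or lost in the mutant dataset and how their sequences might be encoded as input for a convolutional network. The interval-overlap criterion, the peak representation, and the names `overlaps`, `label_preserved`, and `one_hot` are assumptions made for this example; the abstract does not specify how preservation was defined or how sequences were encoded.

```python
from typing import List, Tuple

# Hypothetical peak representation: (chromosome, start, end), half-open interval.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def label_preserved(wt_peaks: List[Peak], mut_peaks: List[Peak]) -> List[int]:
    """Label each wild-type peak 1 if any mutant peak overlaps it, else 0.

    Assumption: simple overlap stands in for 'preserved in the mutant dataset'.
    """
    return [int(any(overlaps(w, m) for m in mut_peaks)) for w in wt_peaks]


def one_hot(seq: str) -> List[List[int]]:
    """One-hot encode a DNA sequence as rows of [A, C, G, T].

    Bases outside A/C/G/T (e.g. N) are encoded as all zeros.
    """
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


# Toy usage: three wild-type peaks, two mutant peaks.
wt = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
mut = [("chr1", 150, 250), ("chr2", 300, 400)]
labels = label_preserved(wt, mut)
encoded = one_hot("GAG")
```

The resulting labels and encoded sequences would serve as the targets and inputs of a binary classifier; the network architecture and the interpretation step are not sketched here.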

Conclusions: Our approach can be applied to any TF for which in vivo binding data from both the wild-type and mutated versions of the TF are available, and our findings provide new potential insights into the binding preferences of CTCF's DBDs.

Keywords: Binding strength; CTCF; Deep neural network; Motif; Mutated transcription factor; Zinc finger.

MeSH terms

  • Animals
  • Binding Sites
  • CCCTC-Binding Factor / metabolism
  • DNA / metabolism
  • Mice
  • Neural Networks, Computer
  • Protein Binding
  • Transcription Factors* / genetics
  • Transcription Factors* / metabolism
  • Zinc Fingers / genetics
  • Zinc* / metabolism

Substances

  • CCCTC-Binding Factor
  • Transcription Factors
  • DNA
  • Zinc