Spatial patterns of CTCF sites define the anatomy of TADs and their boundaries

Genome Biol. 2020 Aug 12;21(1):197. doi: 10.1186/s13059-020-02108-x.

Abstract

Background: Topologically associating domains (TADs) are genomic regions of self-interaction, and TAD boundaries are known to be enriched in CTCF binding sites. CTCF sites are in turn asymmetric: a pair of CTCF sites in convergent orientation leads to the formation of a chromatin loop in vivo. However, it has so far been unclear how to reconcile TAD structure with CTCF-based chromatin loops.

Results: We approach this problem by analysing CTCF binding site strengths and classifying clusters of CTCF sites along the genome on the basis of their relative orientation. Analysis of CTCF site orientation classes as a function of their spatial distribution along the human genome reveals that convergent CTCF site clusters are depleted while divergent CTCF site clusters are enriched in the 5- to 100-kb range. We then analyse the distribution of CTCF binding sites as a function of TAD boundary conservation across seven primary human blood cell types. This reveals divergent CTCF site enrichment at TAD boundaries. Furthermore, convergent arrays of CTCF sites separate the left and right sections of TADs that harbour internal CTCF sites, resulting in unequal TAD 'halves'.

Conclusions: The orientation-based CTCF binding site cluster classification that we present reconciles TAD boundaries and CTCF site clusters in a mechanistically elegant fashion. This model suggests that the emergent structure of nuclear chromatin in the form of TADs relies on the obligate alternation of divergent and convergent CTCF site clusters that occur at different length scales along the genome.
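
Illustrative sketch (not the authors' pipeline): the idea of labelling CTCF sites by relative orientation can be shown on adjacent site pairs. The snippet below is a minimal Python example under stated assumptions. The names CtcfSite, classify_pair and classify_adjacent_pairs are hypothetical, a site is reduced to a chromosome, a motif position and a strand, and the 5- to 100-kb window from the Results is used only as an illustrative default distance filter. The cluster classification in the paper itself may differ in its details.

```python
from typing import List, NamedTuple, Tuple


class CtcfSite(NamedTuple):
    """Hypothetical minimal record for one CTCF motif occurrence."""
    chrom: str
    pos: int     # motif position in bp (assumed: motif midpoint)
    strand: str  # '+' for forward motif, '-' for reverse motif


def classify_pair(left: CtcfSite, right: CtcfSite) -> str:
    """Label an upstream/downstream pair by relative motif orientation."""
    if left.strand == "+" and right.strand == "-":
        return "convergent"  # motifs point towards each other: > <
    if left.strand == "-" and right.strand == "+":
        return "divergent"   # motifs point away from each other: < >
    return "same"            # both forward or both reverse


def classify_adjacent_pairs(
    sites: List[CtcfSite],
    min_gap: int = 5_000,    # assumed lower bound of the distance window
    max_gap: int = 100_000,  # assumed upper bound of the distance window
) -> List[Tuple[CtcfSite, CtcfSite, str]]:
    """Classify neighbouring sites on the same chromosome within a gap window."""
    ordered = sorted(sites, key=lambda s: (s.chrom, s.pos))
    labelled = []
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom != right.chrom:
            continue  # never pair sites across chromosomes
        gap = right.pos - left.pos
        if min_gap <= gap <= max_gap:
            labelled.append((left, right, classify_pair(left, right)))
    return labelled


if __name__ == "__main__":
    # Toy coordinates, invented purely for illustration.
    demo = [
        CtcfSite("chr1", 10_000, "-"),
        CtcfSite("chr1", 30_000, "+"),
        CtcfSite("chr1", 60_000, "-"),
        CtcfSite("chr1", 500_000, "-"),
    ]
    for left, right, label in classify_adjacent_pairs(demo):
        print(left.pos, right.pos, label)
```

Counting the resulting labels per distance bin would give a rough view of which orientation classes are enriched or depleted at a given length scale, which is the kind of comparison the Results describe.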

Keywords: CTCF binding site clusters; CTCF orientation patterns; Chromatin architecture; Loop extrusion; TAD boundary conservation; TADs.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • CCCTC-Binding Factor / metabolism*
  • Genome, Human*
  • Humans

Substances

  • CCCTC-Binding Factor
  • CTCF protein, human