SEGCOND predicts putative transcriptional condensate-associated genomic regions by integrating multi-omics data

Bioinformatics. 2023 Jan 1;39(1):btac742. doi: 10.1093/bioinformatics/btac742.

Abstract

Motivation: The compartmentalization of biochemical reactions involved in the activation of gene expression in the eukaryotic nucleus leads to the formation of membraneless bodies through liquid-liquid phase separation. These formations, called transcriptional condensates, appear to play important roles in gene regulation, as they are assembled through the association of multiple enhancer regions in 3D genomic space. To date, efficient computational methodologies for identifying the regions responsible for the formation of such condensates from genomic and conformational data are still lacking.

Results: In this work, we present SEGCOND, a computational framework that aims to highlight genomic regions involved in the formation of transcriptional condensates. SEGCOND flexibly combines multiple genomic datasets related to enhancer activity and chromatin accessibility to perform a genome segmentation. It then uses this segmentation to detect highly transcriptionally active regions of the genome. In a final step, through the integration of Hi-C data, it identifies regions of putative transcriptional condensates (PTCs) as genomic domains where multiple enhancer elements coalesce in 3D space. SEGCOND identifies a subset of enhancer segments with increased transcriptional activity. PTCs are also found to significantly overlap highly interconnected enhancer elements and super enhancers obtained through two independent approaches. Application of SEGCOND to data from a well-defined system of B-cell to macrophage transdifferentiation leads to the identification of previously unreported genes with a likely role in the process.
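
The three steps described above (segmentation, detection of active segments, grouping by 3D contact) can be illustrated with a toy sketch. This is not SEGCOND's implementation: the function names, the mean-signal scoring, the fixed thresholds, and the connected-component grouping are all assumptions chosen for illustration, and the abstract does not specify the statistical models the tool actually uses.

```python
from itertools import combinations


def segment_active_bins(signals, threshold):
    """Merge consecutive genomic bins whose mean signal meets a threshold.

    signals: list of per-bin tuples, one value per dataset (hypothetical
    stand-ins for enhancer-activity and accessibility tracks).
    Returns half-open (start_bin, end_bin) intervals.
    """
    segments = []
    start = None
    for i, values in enumerate(signals):
        active = sum(values) / len(values) >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(signals)))
    return segments


def mean_contact(contacts, seg_a, seg_b):
    """Average contact value between all bin pairs of two segments."""
    total = 0.0
    count = 0
    for i in range(*seg_a):
        for j in range(*seg_b):
            total += contacts[i][j]
            count += 1
    return total / count


def group_segments(segments, contacts, min_contact):
    """Group segments linked by strong contacts (connected components).

    Assumption: a candidate region is any group of two or more active
    segments connected through pairwise mean contact >= min_contact.
    """
    parent = list(range(len(segments)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(range(len(segments)), 2):
        if mean_contact(contacts, segments[a], segments[b]) >= min_contact:
            parent[find(a)] = find(b)

    groups = {}
    for idx, seg in enumerate(segments):
        groups.setdefault(find(idx), []).append(seg)
    return [g for g in groups.values() if len(g) > 1]


# Toy input: 8 bins, two signal tracks, and a symmetric contact matrix.
signals = [(0.1, 0.2), (0.9, 0.8), (0.9, 1.0), (0.1, 0.0),
           (0.8, 0.9), (0.0, 0.1), (0.1, 0.1), (0.9, 0.9)]
contacts = [[0.0] * 8 for _ in range(8)]
for i, j, value in [(1, 4, 5.0), (2, 4, 3.0)]:
    contacts[i][j] = contacts[j][i] = value

segments = segment_active_bins(signals, threshold=0.5)
candidates = group_segments(segments, contacts, min_contact=2.0)
```

In this toy input, the isolated active segment at bin 7 is excluded because it has no strong contact with any other active segment, which mirrors the idea that a PTC requires multiple enhancer elements to coalesce in 3D space.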

Availability and implementation: Source code and implementation details for SEGCOND are available at https://github.com/AntonisK95/SEGCOND.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin / genetics
  • Enhancer Elements, Genetic*
  • Genomics / methods
  • Multiomics*
  • Nuclear Bodies

Substances

  • Chromatin