Integrative analysis of epigenetics data identifies gene-specific regulatory elements

Nucleic Acids Res. 2021 Oct 11;49(18):10397-10418. doi: 10.1093/nar/gkab798.

Abstract

Understanding how epigenetic variation in non-coding regions is involved in distal gene-expression regulation is an important problem. Regulatory regions can be associated with genes using large-scale datasets of epigenetic and expression data. However, for regions with complex epigenomic signals and for enhancers that regulate many genes, these associations are difficult to interpret. We present StitchIt, an approach to dissect epigenetic variation in a gene-specific manner for the detection of regulatory elements (REMs) without relying on peak calls in individual samples. StitchIt segments epigenetic signal tracks over many samples to determine the location and the target genes of a REM simultaneously. We show that this approach leads to more accurate and refined REM detection than standard methods, even on heterogeneous datasets, which are challenging to model. In addition, StitchIt REMs are highly enriched in experimentally determined chromatin interactions and expression quantitative trait loci. We validated several newly predicted REMs using CRISPR-Cas9 experiments, thereby demonstrating the reliability of StitchIt. StitchIt is able to dissect regulation in superenhancers and predicts thousands of putative REMs that go unnoticed using peak-based approaches, suggesting that a large part of the regulome might be uncharted waters.
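
The general idea of working on signal tracks across samples, rather than on per-sample peak calls, can be illustrated with a toy sketch. This is not StitchIt's algorithm, which the abstract does not specify. It is a simplified stand-in that assumes a binned signal matrix (samples by bins) around one gene, merges adjacent bins whose cross-sample correlation with that gene's expression passes a threshold with the same sign, and reports each merged segment as a candidate region. The names `pearson` and `candidate_segments`, the `min_abs_r` threshold, and the example values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def candidate_segments(signal, expression, min_abs_r=0.8):
    """Toy gene-specific segmentation (an illustrative assumption, not StitchIt itself).

    signal[s][b] is the epigenetic signal of sample s in genomic bin b;
    expression[s] is the expression of one gene in sample s.
    Returns (start_bin, end_bin_exclusive, correlation) for each run of
    adjacent bins that correlate with expression in the same direction.
    """
    n_bins = len(signal[0])
    # Label each bin +1 / -1 / 0 by its cross-sample correlation with expression.
    labels = []
    for b in range(n_bins):
        r = pearson([row[b] for row in signal], expression)
        labels.append(1 if r >= min_abs_r else -1 if r <= -min_abs_r else 0)

    segments = []
    start = 0
    for b in range(1, n_bins + 1):
        # Close the current run at the end of the track or when the label changes.
        if b == n_bins or labels[b] != labels[start]:
            if labels[start] != 0:
                seg_mean = [sum(row[start:b]) / (b - start) for row in signal]
                segments.append((start, b, pearson(seg_mean, expression)))
            start = b
    return segments


# Hypothetical data: 4 samples, 6 bins; bins 2-3 track expression, bin 5 opposes it.
expression = [1.0, 2.0, 3.0, 4.0]
signal = [
    [5.0, 1.0, 1.0, 2.0, 3.0, 8.0],
    [1.0, 5.0, 2.0, 4.0, 3.0, 6.0],
    [5.0, 1.0, 3.0, 6.0, 3.0, 4.0],
    [1.0, 5.0, 4.0, 8.0, 3.0, 2.0],
]
segments = candidate_segments(signal, expression)
```

In this toy input, two candidate regions are reported for the gene: bins 2 to 3 (positively associated) and bin 5 (negatively associated). Because the segmentation is driven by one gene's expression, the same track would be cut differently for a different gene, which is the sense in which location and target gene are obtained together.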

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin / metabolism*
  • Data Analysis*
  • Enhancer Elements, Genetic*
  • Epigenesis, Genetic*
  • Gene Expression Regulation*
  • Human Umbilical Vein Endothelial Cells
  • Humans

Substances

  • Chromatin