A predictive modeling approach for cell line-specific long-range regulatory interactions

Nucleic Acids Res. 2015 Oct 15;43(18):8694-712. doi: 10.1093/nar/gkv865. Epub 2015 Sep 3.

Abstract

Long-range regulatory interactions among distal enhancers and target genes are important for tissue-specific gene expression. Genome-scale identification of these interactions in a cell line-specific manner, especially using the fewest possible data sets, is a significant challenge. We develop a novel computational approach, Regulatory Interaction Prediction for Promoters and Long-range Enhancers (RIPPLE), that integrates published Chromosome Conformation Capture (3C) data sets with a minimal set of regulatory genomic data sets to predict enhancer-promoter interactions in a cell line-specific manner. Our results suggest that CTCF, RAD21, a general transcription factor (TBP) and activating chromatin marks are important determinants of enhancer-promoter interactions. To predict interactions in a new cell line and to generate genome-wide interaction maps, we develop an ensemble version of RIPPLE and apply it to generate interactions in five human cell lines. Computational validation of these predictions using existing ChIA-PET and Hi-C data sets showed that RIPPLE accurately predicts interactions among enhancers and promoters. Enhancer-promoter interactions tend to be organized into subnetworks representing coordinately regulated sets of genes that are enriched for specific biological processes and cis-regulatory elements. Overall, our work provides a systematic approach to predict and interpret enhancer-promoter interactions in a genome-wide, cell type-specific manner using a few experimentally tractable measurements.
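
The ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the RIPPLE implementation: the abstract does not state the classifier type, the feature encoding, or how models are combined, so everything below is an assumption made for illustration. The sketch assumes one toy logistic model per training cell line, a four-value feature vector per enhancer-promoter pair (hypothetical CTCF, RAD21, TBP and activating-mark signals), and simple averaging of the per-model probabilities. All names, weights and thresholds are invented.

```python
import math
from statistics import mean
from typing import Callable, Dict, List, Sequence

# A model maps the feature vector of one enhancer-promoter pair
# to a probability that the pair interacts.
Model = Callable[[Sequence[float]], float]


def make_logistic_model(weights: Sequence[float], bias: float) -> Model:
    """Build a toy logistic scorer (a stand-in for a trained classifier)."""
    def predict(features: Sequence[float]) -> float:
        z = bias + sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z))
    return predict


def ensemble_score(models: Dict[str, Model], features: Sequence[float]) -> float:
    """Average the probabilities from all per-cell-line models (assumed rule)."""
    return mean(model(features) for model in models.values())


def predict_interactions(
    models: Dict[str, Model],
    pairs: Dict[str, Sequence[float]],
    threshold: float = 0.5,
) -> List[str]:
    """Return the pair names whose ensemble score reaches the threshold."""
    return [
        name
        for name, features in pairs.items()
        if ensemble_score(models, features) >= threshold
    ]


# Hypothetical models "trained" on two cell lines; weights are made up.
# Feature order (assumed): CTCF, RAD21, TBP, activating chromatin mark.
models = {
    "line_A": make_logistic_model([1.0, 1.0, 0.5, 0.5], bias=-2.0),
    "line_B": make_logistic_model([0.8, 1.2, 0.4, 0.6], bias=-2.0),
}

# Hypothetical candidate pairs in a new cell line.
pairs = {
    "pair_strong": [2.0, 2.0, 1.0, 1.0],
    "pair_weak": [0.1, 0.0, 0.2, 0.1],
}

predicted = predict_interactions(models, pairs)
```

Averaging is only one way to combine per-cell-line models; weighted voting or stacking would fit the same interface by changing `ensemble_score`.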

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • CCCTC-Binding Factor
  • Cell Cycle Proteins / analysis
  • Cell Line
  • Chromatin / chemistry
  • Chromatin / metabolism
  • Chromosomal Proteins, Non-Histone / analysis
  • Enhancer Elements, Genetic*
  • Genomics / methods*
  • Histone Code
  • Humans
  • Models, Genetic*
  • Promoter Regions, Genetic*
  • Repressor Proteins / analysis
  • TATA-Box Binding Protein / analysis

Substances

  • CCCTC-Binding Factor
  • CTCF protein, human
  • Cell Cycle Proteins
  • Chromatin
  • Chromosomal Proteins, Non-Histone
  • Repressor Proteins
  • TATA-Box Binding Protein
  • cohesins