DeepChrome: deep-learning for predicting gene expression from histone modifications

Bioinformatics. 2016 Sep 1;32(17):i639-i648. doi: 10.1093/bioinformatics/btw427.

Abstract

Motivation: Histone modifications are among the most important factors that control gene regulation. Computational methods that predict gene expression from histone modification signals are highly desirable for understanding their combinatorial effects in gene regulation. This knowledge can help in developing 'epigenetic drugs' for diseases like cancer. Previous studies that quantified the relationship between histone modifications and gene expression levels either failed to capture combinatorial effects or relied on multiple methods that separate prediction from combinatorial analysis. This paper develops a unified discriminative framework that uses a deep convolutional neural network to classify gene expression from histone modification data. Our system, called DeepChrome, allows automatic extraction of complex interactions among important features. To simultaneously visualize the combinatorial interactions among histone modifications, we propose a novel optimization-based technique that generates feature pattern maps from the learnt deep model. This provides an intuitive description of the underlying epigenetic mechanisms that regulate genes.
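To make the classification setup concrete, the following is a minimal sketch of a convolutional forward pass over a histone modification signal matrix (one row per mark, one column per genomic bin). It is not the authors' implementation: the layer sizes, the single convolution and pooling stage, the random untrained weights, and all names (`conv1d_relu`, `max_pool`, `predict_high_expression`, `random_params`) are assumptions chosen for illustration.

```python
import math
import random


def conv1d_relu(x, kernels, biases):
    """Slide each filter along the bin axis, summing over all marks, then apply ReLU.

    x: list of n_marks rows, each with n_bins signal values.
    kernels: [n_filters][n_marks][kernel_width] weights.
    """
    n_bins = len(x[0])
    width = len(kernels[0][0])
    feature_maps = []
    for kernel, bias in zip(kernels, biases):
        row = []
        for start in range(n_bins - width + 1):
            total = bias
            for mark, mark_weights in enumerate(kernel):
                for offset, weight in enumerate(mark_weights):
                    total += weight * x[mark][start + offset]
            row.append(max(0.0, total))
        feature_maps.append(row)
    return feature_maps


def max_pool(feature_maps, size):
    """Non-overlapping max pooling along the bin axis."""
    return [[max(row[i:i + size]) for i in range(0, len(row) - size + 1, size)]
            for row in feature_maps]


def predict_high_expression(x, params):
    """Return a probability-like score that the gene is highly expressed."""
    pooled = max_pool(conv1d_relu(x, params["kernels"], params["conv_bias"]),
                      params["pool"])
    flat = [value for row in pooled for value in row]
    z = params["out_bias"] + sum(w * v for w, v in zip(params["out_weights"], flat))
    return 1.0 / (1.0 + math.exp(-z))


def random_params(n_marks, n_bins, n_filters=4, width=5, pool=2, seed=0):
    """Hypothetical untrained parameters; a real model would learn these."""
    rng = random.Random(seed)
    pooled_len = (n_bins - width + 1) // pool
    return {
        "kernels": [[[rng.uniform(-0.1, 0.1) for _ in range(width)]
                     for _ in range(n_marks)] for _ in range(n_filters)],
        "conv_bias": [0.0] * n_filters,
        "pool": pool,
        "out_weights": [rng.uniform(-0.1, 0.1) for _ in range(n_filters * pooled_len)],
        "out_bias": 0.0,
    }


# Illustrative input: 5 marks x 20 bins of made-up signal values.
rng = random.Random(1)
signal = [[rng.random() for _ in range(20)] for _ in range(5)]
params = random_params(n_marks=5, n_bins=20)
probability = predict_high_expression(signal, params)
```

Because each filter spans every mark at once, a single filter can respond to a combination of marks at nearby bins, which is one way a convolutional model can pick up combinatorial effects without hand-built interaction features.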

Results: We show that DeepChrome outperforms state-of-the-art models like Support Vector Machines and Random Forests on the gene expression classification task across 56 different cell types from the REMC database. The output of our visualization technique not only validates previous observations but also yields novel insights into combinatorial interactions among histone modification marks, some of which have recently been observed in experimental studies.
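The optimization-based visualization can be illustrated with a generic sketch: starting from a blank input, adjust the input itself so that a trained model's class score increases, and read the result as a pattern map over marks and bins. The abstract does not give the exact objective, so the L2 penalty, the non-negativity clamp, the finite-difference gradient, and the toy linear scorer standing in for a trained network are all assumptions for illustration.

```python
def objective(score_fn, x, l2):
    """Class score minus an L2 penalty that keeps the input bounded."""
    penalty = sum(value * value for row in x for value in row)
    return score_fn(x) - l2 * penalty


def numeric_gradient(f, x, eps=1e-4):
    """Central finite-difference gradient of f with respect to every input cell."""
    grad = [[0.0] * len(row) for row in x]
    for i, row in enumerate(x):
        for j in range(len(row)):
            original = row[j]
            row[j] = original + eps
            up = f(x)
            row[j] = original - eps
            down = f(x)
            row[j] = original  # restore the cell before moving on
            grad[i][j] = (up - down) / (2 * eps)
    return grad


def feature_pattern_map(score_fn, n_marks, n_bins, l2=0.5, lr=0.1, steps=200):
    """Gradient ascent on the input; values are clamped at zero (assumed non-negative signal)."""
    x = [[0.0] * n_bins for _ in range(n_marks)]

    def f(inp):
        return objective(score_fn, inp, l2)

    for _ in range(steps):
        grad = numeric_gradient(f, x)
        for i in range(n_marks):
            for j in range(n_bins):
                x[i][j] = max(0.0, x[i][j] + lr * grad[i][j])
    return x


# Hypothetical stand-in for a trained model's class score: a linear scorer
# over 2 marks x 2 bins. Positive weights favour signal, negative ones oppose it.
TOY_WEIGHTS = [[1.0, -1.0], [0.5, 0.0]]


def toy_score(x):
    return sum(w * v for w_row, x_row in zip(TOY_WEIGHTS, x)
               for w, v in zip(w_row, x_row))


pattern = feature_pattern_map(toy_score, n_marks=2, n_bins=2)
```

In this toy case the map converges to high values only where the scorer rewards signal; with a real network, a framework's automatic differentiation would replace the finite-difference gradient.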

Availability and implementation: Code and results are available at www.deepchrome.org.

Contact: yanjun@virginia.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Cluster Analysis
  • Computational Biology
  • Epigenesis, Genetic
  • Gene Expression Regulation*
  • Gene Regulatory Networks
  • Histone Code*
  • Humans
  • Neural Networks, Computer
  • Support Vector Machine*