A self-attention model for inferring cooperativity between regulatory features

Nucleic Acids Res. 2021 Jul 21;49(13):e77. doi: 10.1093/nar/gkab349.

Abstract

Deep learning has demonstrated its predictive power in modeling complex biological phenomena such as gene expression. The value of these models hinges not only on their accuracy, but also on the ability to extract biologically relevant information from the trained models. While there has been much recent work on developing feature attribution methods that discover the most important features for a given sequence, inferring cooperativity between regulatory elements, which is the hallmark of phenomena such as gene expression, remains an open problem. We present SATORI, a Self-ATtentiOn based model to detect Regulatory element Interactions. Our approach combines convolutional layers with a self-attention mechanism that helps us capture a global view of the landscape of interactions between regulatory elements in a sequence. A comprehensive evaluation demonstrates the ability of SATORI to identify numerous statistically significant TF-TF interactions, many of which have been previously reported. Our method is able to detect higher numbers of experimentally verified TF-TF interactions than existing methods, and has the advantage of not requiring a computationally expensive post-processing step. Finally, SATORI can be used for detection of any type of feature interaction in models that use a similar attention mechanism, and is not limited to the detection of TF-TF interactions.
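The combination of convolutional layers with self-attention described above can be illustrated with a minimal NumPy sketch. This is not the SATORI implementation. It assumes a single attention head, random untrained weights, and hypothetical helper names (`one_hot`, `conv1d_relu`, `self_attention`) and layer sizes chosen purely for illustration. It shows only how convolutional filter activations at each sequence position can be passed to an attention layer whose position-by-position weight matrix gives a global view of pairwise relationships.

```python
import numpy as np

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix in A, C, G, T order."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, index[base]] = 1.0
    return x

def conv1d_relu(x, filters):
    """Slide each motif-like filter along the sequence, then apply ReLU.

    x: (length, 4); filters: (n_filters, width, 4).
    Returns activations of shape (length - width + 1, n_filters).
    """
    n_filters, width, _ = filters.shape
    n_pos = x.shape[0] - width + 1
    windows = np.stack([x[i:i + width] for i in range(n_pos)])  # (n_pos, width, 4)
    scores = np.einsum("pwc,fwc->pf", windows, filters)
    return np.maximum(scores, 0.0)

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(h, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over sequence positions.

    Returns the attended features and the (n_pos, n_pos) attention matrix.
    """
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]))
    return attn @ v, attn

# Illustrative sizes only (assumptions, not values from the paper).
rng = np.random.default_rng(0)
x = one_hot("ACGTGACGTCAGGCTA")                 # 16 bases
filters = rng.normal(size=(8, 5, 4))            # 8 filters of width 5
h = conv1d_relu(x, filters)                     # (12, 8) filter activations
w_q, w_k, w_v = (rng.normal(size=(8, 6)) for _ in range(3))
out, attn = self_attention(h, w_q, w_k, w_v)    # attn is (12, 12)
```

In this sketch, `attn[i, j]` is the weight that position `i` places on position `j`. In a trained model, matrices of this kind are the natural place to look for candidate interactions between the features active at two positions. How such weights are turned into statistically significant TF-TF interactions is not specified in the abstract and is not reproduced here.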

MeSH terms

  • Arabidopsis / genetics
  • Cell Line
  • Chromatin Immunoprecipitation Sequencing
  • Deep Learning*
  • Genomics / methods*
  • Humans
  • Nucleotide Motifs
  • Promoter Regions, Genetic
  • Regulatory Elements, Transcriptional*
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors