A High-Throughput Screen for Transcription Activation Domains Reveals Their Sequence Features and Permits Prediction by Deep Learning

Mol Cell. 2020 Jun 4;78(5):890-902.e6. doi: 10.1016/j.molcel.2020.04.020. Epub 2020 May 15.

Abstract

Acidic transcription activation domains (ADs) are encoded by a wide range of seemingly unrelated amino acid sequences, making it difficult to recognize features that promote their dynamic behavior, "fuzzy" interactions, and target specificity. We screened a large set of random 30-mer peptides for AD function in yeast and trained a deep neural network (ADpred) on the AD-positive and -negative sequences. ADpred identifies known acidic ADs within transcription factors and accurately predicts the consequences of mutations. Our work reveals that strong acidic ADs contain multiple clusters of hydrophobic residues near acidic side chains, explaining why ADs often have a biased amino acid composition. ADs likely use a binding mechanism similar to avidity where a minimum number of weak dynamic interactions are required between activator and target to generate biologically relevant affinity and in vivo function. This mechanism explains the basis for fuzzy binding observed between acidic ADs and targets.

Keywords: activator; allovalency; avidity; coactivator; deep learning; enhancer; intrinsically disordered protein; machine learning; transcription activation; transcriptional regulation.
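The sequence feature named in the abstract, hydrophobic residues clustered near acidic side chains, can be illustrated with a simple count over 30-residue windows. The sketch below is a toy heuristic, not ADpred and not the authors' method. The residue sets, the `reach` parameter, and both function names are assumptions made for illustration only.

```python
# Toy heuristic, NOT ADpred: count hydrophobic residues that sit close to an
# acidic residue. The residue sets and the default reach are assumptions.
HYDROPHOBIC = set("WFYLIMV")  # assumed definition of "hydrophobic"
ACIDIC = set("DE")            # aspartate and glutamate


def hydrophobic_near_acidic(seq, reach=3):
    """Count hydrophobic residues with an acidic residue within `reach` positions."""
    seq = seq.upper()
    count = 0
    for i, residue in enumerate(seq):
        if residue not in HYDROPHOBIC:
            continue
        start = max(0, i - reach)
        # Neighbours on both sides, excluding the residue itself.
        neighbours = seq[start:i] + seq[i + 1:i + 1 + reach]
        if any(n in ACIDIC for n in neighbours):
            count += 1
    return count


def scan_windows(protein, width=30, reach=3):
    """Slide a `width`-residue window along `protein`; return (start, count) pairs.

    A protein shorter than `width` is scored as a single window.
    """
    protein = protein.upper()
    n_windows = max(1, len(protein) - width + 1)
    return [
        (start, hydrophobic_near_acidic(protein[start:start + width], reach))
        for start in range(n_windows)
    ]
```

Windows with high counts would only be candidates for closer inspection. A count like this ignores everything a trained network could learn about residue order and context, so it should not be read as a prediction of AD function.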

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence / genetics
  • Basic-Leucine Zipper Transcription Factors / genetics
  • DNA-Binding Proteins / metabolism
  • Deep Learning
  • High-Throughput Screening Assays / methods*
  • Protein Binding
  • Protein Domains / genetics
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism
  • Trans-Activators / genetics
  • Trans-Activators / metabolism
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism
  • Transcriptional Activation / genetics*
  • Transcriptional Activation / physiology

Substances

  • Basic-Leucine Zipper Transcription Factors
  • DNA-Binding Proteins
  • Saccharomyces cerevisiae Proteins
  • Trans-Activators
  • Transcription Factors