Sequence basis of transcription initiation in the human genome

Science. 2024 Apr 26;384(6694):eadj0116. doi: 10.1126/science.adj0116. Epub 2024 Apr 26.

Abstract

Transcription initiation is essential to the proper function of any gene, yet we still lack a unified understanding of the sequence patterns and rules that explain most transcription start sites in the human genome. By predicting transcription initiation at base-pair resolution from sequence with Puffin, a deep learning-inspired explainable model, we show that a small set of simple rules can explain transcription initiation at most human promoters. We identify key sequence patterns that contribute to human promoter activity, each activating transcription with distinct position-specific effects. Furthermore, we explain the sequence basis of bidirectional transcription at promoters, identify the links between promoter sequence and gene expression variation across cell types, and explore the conservation of sequence determinants of transcription initiation across mammalian species.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Base Sequence
  • Deep Learning
  • Genome, Human*
  • Humans
  • Promoter Regions, Genetic*
  • Transcription Initiation Site*
  • Transcription Initiation, Genetic*