A short ORF-encoded transcriptional regulator

Proc Natl Acad Sci U S A. 2021 Jan 26;118(4):e2021943118. doi: 10.1073/pnas.2021943118.


Recent technological advances have expanded the annotated protein-coding content of mammalian genomes, as hundreds of previously unidentified, short open reading frame (ORF)-encoded peptides (SEPs) have now been found to be translated. Although several studies have identified important physiological roles for this emerging protein class, a general method to define their interactomes is lacking. Here, we demonstrate that genetic incorporation of the photo-crosslinking noncanonical amino acid AbK into SEP transgenes allows for the facile identification of SEP cellular interaction partners using affinity-based methods. From a survey of seven SEPs, we report the discovery of short ORF-encoded histone binding protein (SEHBP), a conserved microprotein that interacts with chromatin-associated proteins, localizes to discrete genomic loci, and induces a robust transcriptional program when overexpressed in human cells. This work affords a straightforward method to help define the physiological roles of SEPs and demonstrates its utility by identifying SEHBP as a short ORF-encoded transcription factor.

Keywords: expanded genetic code; photo-crosslinking; short open reading frame-encoded peptide; transcriptional regulation.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Cattle
  • Chromatin / chemistry
  • Chromatin / metabolism
  • Diazomethane / analogs & derivatives
  • Diazomethane / metabolism*
  • Gene Expression Regulation
  • Genetic Loci
  • HEK293 Cells
  • HeLa Cells
  • Histones / genetics*
  • Histones / metabolism
  • Humans
  • K562 Cells
  • Lysine / analogs & derivatives
  • Lysine / metabolism*
  • Mice
  • Open Reading Frames*
  • Pan troglodytes
  • Peptides / genetics*
  • Peptides / metabolism
  • Protein Binding / radiation effects
  • Protein Interaction Mapping
  • Rats
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Transcription, Genetic* / radiation effects
  • Transgenes
  • Ultraviolet Rays

Substances

  • Chromatin
  • Histones
  • Peptides
  • Diazomethane
  • Lysine