Newfound Coding Potential of Transcripts Unveils Missing Members of Human Protein Communities

Genomics Proteomics Bioinformatics. 2023 Jun;21(3):515-534. doi: 10.1016/j.gpb.2022.09.008. Epub 2022 Sep 30.

Abstract

Recent proteogenomic approaches have led to the discovery that regions of the transcriptome previously annotated as non-coding [i.e., untranslated regions (UTRs), open reading frames overlapping annotated coding sequences in a different reading frame, and non-coding RNAs] frequently encode proteins, termed alternative proteins (altProts). This suggests that previously identified protein-protein interaction (PPI) networks are incomplete, because altProts are not present in conventional protein databases. Here, we used the proteogenomic resource OpenProt and a combined spectrum- and peptide-centric analysis to re-analyze a high-throughput human network proteomics dataset, revealing the presence of 261 altProts in the network. We found 19 genes that each encode an annotated (reference) protein and an alternative protein that interact with each other. Of the 117 altProts encoded by pseudogenes, 38 are direct interactors of reference proteins encoded by their respective parental genes. Finally, we experimentally validated several interactions involving altProts. These data improve the blueprints of the human PPI network and suggest functional roles for hundreds of altProts.
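The kind of network query behind the count of genes whose reference and alternative proteins interact can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the data layout, and every identifier below are hypothetical, and the toy data are invented purely for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical annotation: protein ID -> (encoding gene, "ref" or "alt").
ProteinInfo = Dict[str, Tuple[str, str]]


def genes_with_ref_alt_interaction(
    interactions: Iterable[Tuple[str, str]],
    protein_info: ProteinInfo,
) -> Set[str]:
    """Return genes whose reference and alternative proteins interact directly."""
    genes: Set[str] = set()
    for a, b in interactions:
        gene_a, kind_a = protein_info[a]
        gene_b, kind_b = protein_info[b]
        # Keep the edge only if both proteins come from the same gene
        # and exactly one of them is a reference protein.
        if gene_a == gene_b and {kind_a, kind_b} == {"ref", "alt"}:
            genes.add(gene_a)
    return genes


# Toy data with invented identifiers (not taken from the study).
toy_info: ProteinInfo = {
    "REF_A": ("GENE_A", "ref"),
    "ALT_A": ("GENE_A", "alt"),
    "REF_B": ("GENE_B", "ref"),
    "ALT_C": ("GENE_C", "alt"),
}
toy_edges = [("REF_A", "ALT_A"), ("REF_B", "ALT_C"), ("REF_A", "REF_B")]

toy_result = genes_with_ref_alt_interaction(toy_edges, toy_info)
```

In this toy network only GENE_A qualifies, because it is the only gene with an edge linking its own reference and alternative proteins. The pseudogene result could be approached the same way by comparing each altProt's parental gene with the gene of its reference interactor, given a pseudogene-to-parent mapping.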

Keywords: Affinity purification mass spectrometry; Alternative protein; Protein network; Protein–protein interaction; Pseudogene.

MeSH terms

  • Databases, Protein
  • Humans
  • Open Reading Frames
  • Peptides
  • Proteins* / genetics
  • Proteomics*

Substances

  • Proteins
  • Peptides