Biophysical characterization of high-confidence, small human proteins

bioRxiv [Preprint]. 2024 Apr 15:2024.04.12.589296. doi: 10.1101/2024.04.12.589296.

Abstract

Significant efforts have been made to characterize the biophysical properties of proteins. Small proteins have received less attention because their annotation has historically been less reliable. However, recent improvements in sequencing, proteomics, and bioinformatics techniques have led to the high-confidence annotation of small open reading frames (smORFs) that encode for functional proteins, producing smORF-encoded proteins (SEPs). SEPs have been found to perform critical functions in several species, including humans. While significant efforts have been made to annotate SEPs, less attention has been given to the biophysical properties of these proteins. We characterized the distributions of predicted and curated biophysical properties, including sequence composition, structure, localization, function, and disease association of a conservative list of previously identified human SEPs. We found significant differences between SEPs and both larger proteins and control sets. Additionally, we provide an example of how our characterization of biophysical properties can contribute to distinguishing protein-coding smORFs from non-coding ones in otherwise ambiguous cases.

Keywords: Amino Acids; Computational Biology; Genome; Human; Peptides; Proteins.

Publication types

  • Preprint