Multi-integrated approach for unraveling small open reading frames potentially associated with secondary metabolism in Streptomyces

mSystems. 2023 Sep 15;e0024523. doi: 10.1128/msystems.00245-23. Online ahead of print.


Small open reading frames (smORFs) are widely distributed in living organisms, yet their functions remain largely unexplored. In addition, the annotation and detection of smORFs by existing methods are limited and hindered by the specific properties of smORFs. In this study, we systematically investigated smORFs and smORF-encoded peptides (SEPs) in Streptomyces, which are well-known bacterial producers of diverse bioactive secondary metabolites. We established a peptidogenomic workflow based on a multi-integrated comprehensive database search and database-independent de novo sequencing to identify smORFs in Streptomyces xinghaiensis NRRL B-24674T (S187). In addition, we described the SEPome related to secondary metabolism, which includes 68 novel SEPs and 79 smORFs shared with Streptomyces coelicolor A3(2). Functional analysis of the universal smORFs revealed enrichment in biosynthetic processes, stress response, ribosomes, and nucleic acid binding. Meanwhile, five cryptic smORF-encoded peptides (CSEPs) were found in non-annotated regions of the genome, and non-coding RNAs could encode CSEPs. A total of 66 new RNAs, including 32 non-coding RNAs (ncRNAs), were revealed, and 4 ncRNA-encoded peptides were identified. Furthermore, an investigation of carbon metabolism showed that NagE functions in spore formation and secondary metabolism in Streptomyces. In particular, NagE was observed to function in the biosynthesis of anti-complement agents in S. xinghaiensis, suggesting a novel role of the phosphoenolpyruvate phosphotransferase system in microbial secondary metabolism. We thus provide an effective strategy for analyzing public data sets of model strains to identify smORFs in non-model species. The ncRNAs and SEPs represent rich sources for engineering streptomycetes to produce bioactive compounds.

IMPORTANCE Due to their small size and special chemical features, small open reading frame (smORF)-encoded peptides (SEPs) are often neglected. However, they may play critical roles in regulating gene expression, enzyme activity, and metabolite production. Studies on bacterial microproteins have mainly focused on pathogenic bacteria, so it is important to systematically investigate SEPs in streptomycetes, which are rich sources of bioactive secondary metabolites. Our study is the first to perform a global identification of smORFs in streptomycetes. We established a peptidogenomic workflow for non-model microbial strains and identified multiple novel smORFs that are potentially linked to secondary metabolism in streptomycetes. The multi-integrated approach used in this study helps improve both the quality and the quantity of the detected smORFs. Ultimately, the workflow we established could be extended to other organisms and would benefit the genome mining of microproteins with critical functions for the regulation and engineering of useful microorganisms.
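
As an illustration of the database-search side of such a workflow, one common way to obtain candidate smORFs is to scan all six reading frames of a genome for short start-to-stop stretches. The abstract does not state how the authors built their search databases, so the following is only a minimal sketch under assumptions: the function names, the start codon set (ATG, GTG, TTG), and the length cutoffs (10 to 100 codons) are hypothetical choices, not the settings used in the study.

```python
# Illustrative sketch only: all names and thresholds below are assumptions,
# not the parameters or code used in the study.
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CODONS = 10    # assumed lower bound (codons, stop excluded)
MAX_CODONS = 100   # assumed smORF upper bound (codons, stop excluded)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_smorfs(seq, min_codons=MIN_CODONS, max_codons=MAX_CODONS):
    """Enumerate candidate smORFs in all six reading frames.

    Returns tuples (strand, frame, start, end, orf_dna). Coordinates are
    0-based, end-exclusive, and relative to the scanned strand. For each
    stop codon, only the most upstream in-frame start is reported.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i  # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3  # stop codon not counted
                    if min_codons <= n_codons <= max_codons:
                        hits.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # close the ORF and look for the next one
    return hits


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 12 + "TAA"  # toy sequence, not real genome data
    for hit in find_smorfs(toy):
        print(hit)
```

In a peptidogenomic setting, candidates like these would be translated and used as a search space for mass spectrometry data, while de novo sequencing offers a complementary route that does not depend on any such database.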

Keywords: Streptomyces; de novo sequencing; peptidogenomics; secondary metabolism; smORF-encoded peptides.