Towards the characterization of the hidden world of small proteins in Staphylococcus aureus, a proteogenomics approach

PLoS Genet. 2021 Jun 1;17(6):e1009585. doi: 10.1371/journal.pgen.1009585. eCollection 2021 Jun.

Abstract

Small proteins play essential roles in bacterial physiology and virulence; however, automated algorithms for genome annotation are often not yet able to accurately predict the corresponding genes. The accuracy and reliability of genome annotations, particularly for small open reading frames (sORFs), can be significantly improved by integrating protein evidence from experimental approaches. Here we present a highly optimized and flexible bioinformatics workflow for bacterial proteogenomics covering all steps from (i) generation of protein databases, (ii) database searches and (iii) peptide-to-genome mapping to (iv) visualization of results. We used the workflow to identify high-quality peptide-spectrum matches (PSMs) for small proteins (≤ 100 aa, SP100) in Staphylococcus aureus Newman. Protein extracts from S. aureus were subjected to different experimental workflows for protein digestion and prefractionation and measured with highly sensitive mass spectrometers. In total, 175 SP100 were identified. Of these, 24 (ranging from 9 to 99 aa) were novel and not contained in the genome annotation used. A total of 144 SP100 are highly conserved and were found in at least 50% of the publicly available S. aureus genomes, while 127 are additionally conserved in other staphylococci. Almost half of the identified SP100 were basic, suggesting a role in binding to more acidic molecules such as nucleic acids or phospholipids.
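
As an illustration of step (i), the generation of a protein database that includes unannotated sORFs, the following minimal Python sketch enumerates candidate small ORFs in all six reading frames of a DNA sequence. It is not the authors' workflow. The function name `small_orfs`, the restriction to ATG, GTG and TTG as start codons, and the default length window of 9 to 100 aa (taken from the size range reported above) are assumptions made for this example.

```python
from itertools import product

BASES = "TCAG"
# Amino acids in TCAG codon order (standard genetic code; the bacterial
# translation table 11 uses the same codon-to-amino-acid assignments).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}
STARTS = {"ATG", "GTG", "TTG"}  # assumption: most common bacterial starts only


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def small_orfs(genome, min_aa=9, max_aa=100):
    """Yield (strand, start, protein) for start-to-stop ORFs in six frames.

    `start` is the 0-based offset of the start codon on the scanned strand
    (for "-", an offset into the reverse complement). The first start codon
    after a stop is used, so each stop yields the longest possible ORF.
    """
    genome = genome.upper()
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if codon in STOPS:
                    if start is not None:
                        # Any start codon is translated as methionine;
                        # codons with ambiguous bases become "X".
                        body = "".join(
                            CODON_TABLE.get(seq[j:j + 3], "X")
                            for j in range(start + 3, i, 3)
                        )
                        protein = "M" + body
                        if min_aa <= len(protein) <= max_aa:
                            yield strand, start, protein
                    start = None
                elif start is None and codon in STARTS:
                    start = i


if __name__ == "__main__":
    # Toy sequence; min_aa is lowered so that the short ORF is reported.
    for hit in small_orfs("ATGAAACGTAAAATTTAA", min_aa=1):
        print(hit)
```

A database built this way is deliberately over-inclusive: most candidates will not be expressed, and only those supported by high-quality PSMs in the subsequent database search would be retained.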

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism*
  • Computer Simulation
  • Databases, Protein
  • Mass Spectrometry / methods
  • Molecular Sequence Annotation
  • Open Reading Frames
  • Peptide Hydrolases / metabolism
  • Phylogeny
  • Proteogenomics / methods*
  • Staphylococcus aureus / genetics
  • Staphylococcus aureus / metabolism*

Substances

  • Bacterial Proteins
  • Peptide Hydrolases

Grants and funding

This work was funded by the Deutsche Forschungsgemeinschaft (https://www.dfg.de/) (GRK PROCOMPAS) to SE, by the Deutsche Forschungsgemeinschaft (GRK PROCOMPAS) to LJ, by the Deutsche Forschungsgemeinschaft (INST 188/365-1 FUGG DFG) to SE, by the Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (http://www.snf.ch/de/Seiten/default.aspx) (197391) to CHA, and by the Deutsche Forschungsgemeinschaft (IG 73/16-1 SPP 2002) to ZI. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.