Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium

BMC Genomics. 2011 Aug 25:12:433. doi: 10.1186/1471-2164-12-433.

Abstract

Background: Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. However, protein-coding genes in most new genomes are determined almost entirely by inference from computational predictions, which have significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function.

Results: We experimentally annotated the bacterial pathogen Salmonella Typhimurium 14028, using "shotgun" proteomics to accurately uncover the translational landscape and post-translational features. The data provide protein-level experimental validation for approximately half of the predicted protein-coding genes in Salmonella and suggest revisions to several genes that appear to have incorrectly assigned translational start sites, including a potential novel alternate start codon. Additionally, we uncovered 12 non-annotated genes missed by gene prediction programs, as well as evidence suggesting a role for one of these novel ORFs in Salmonella pathogenesis. We also characterized post-translational features in the Salmonella genome, including chemical modifications and proteolytic cleavages. We find that bacteria have a much larger and more complex repertoire of chemical modifications than previously thought, including several novel modifications. Our in vivo proteolysis data identified more than 130 signal peptide and N-terminal methionine cleavage events critical for protein function.
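
The idea of finding non-annotated genes from peptide evidence can be illustrated with a minimal sketch. It is not the pipeline used in this study, which the abstract does not describe. It assumes peptides have already been identified, translates a genomic sequence in all six reading frames using the standard genetic code, and reports peptides that match a translated frame but no annotated protein. The function names, the toy sequence, and the exact-substring matching are illustrative assumptions; a real analysis would use a database search engine and false discovery rate control.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map (strand, offset) to the translation of that reading frame."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def find_unannotated_peptides(peptides, genome, annotated_proteins):
    """Return peptides found in a six-frame translation but in no annotated protein.

    Hypothetical helper: exact substring matching stands in for a real
    peptide-spectrum search against a six-frame database.
    """
    frames = six_frame_translations(genome)
    hits = {}
    for pep in peptides:
        if any(pep in prot for prot in annotated_proteins):
            continue  # already explained by the existing annotation
        locations = [frame for frame, protein in frames.items() if pep in protein]
        if locations:
            hits[pep] = locations
    return hits


if __name__ == "__main__":
    toy_genome = "ATGAAACCCGGGTTTTAA"   # made-up sequence, not Salmonella data
    toy_annotation = ["MKPGF"]          # made-up annotated protein
    toy_peptides = ["MKPG", "ETRV", "WWWW"]
    print(find_unannotated_peptides(toy_peptides, toy_genome, toy_annotation))
```

In this toy run, the peptide that matches only an unannotated reading frame is the kind of evidence that would prompt a closer look at a candidate novel ORF or a misassigned start site.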

Conclusion: This work highlights several ways in which application of proteomics data can improve the quality of genome annotations to facilitate novel biological insights and provides a comprehensive proteome map of Salmonella as a resource for systems analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Chromatography, Liquid
  • Genome, Bacterial*
  • Molecular Sequence Annotation / methods*
  • Open Reading Frames
  • Protein Processing, Post-Translational
  • Proteolysis
  • Proteome / analysis
  • Proteomics / methods*
  • Salmonella typhimurium / genetics*
  • Tandem Mass Spectrometry

Substances

  • Proteome