Background: Mycoplasma hyopneumoniae causes respiratory disease in swine and contributes to the porcine respiratory disease complex, a major disease problem in the swine industry. The M. hyopneumoniae strain 232 genome is one of the smallest and best annotated microbial genomes, containing only 728 annotated genes and 691 known proteins. Standard protein databases for mass spectrometry allow only known and predicted proteins to be identified, so errors in those predictions can limit our understanding of the biological processes at work. Proteogenomic mapping is a methodology that uses the entire six-frame translation of an organism's genome as a mass spectrometry database, which helps to identify unknown proteins and to correct and confirm existing annotations. In this study, proteogenomic mapping was used to perform an in-depth analysis of the M. hyopneumoniae proteome.
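To make the idea of a six-frame search database concrete, the following is a minimal sketch, not the pipeline used in this study. It translates a DNA sequence in all three forward and three reverse-complement frames and reports where a peptide occurs. It assumes NCBI translation table 4 (the Mycoplasma/Spiroplasma code, in which TGA encodes tryptophan rather than a stop); the abstract does not state which table was used. The function names and the toy sequence are hypothetical.

```python
from itertools import product

# Assumption: NCBI translation table 4 (Mycoplasma/Spiroplasma).
# It matches the standard code except that TGA encodes Trp (W).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(genome):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to translated strings."""
    genome = genome.upper()
    reverse = reverse_complement(genome)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(genome[offset:])
        frames[-(offset + 1)] = translate(reverse[offset:])
    return frames


def find_peptide(frames, peptide):
    """Return (frame, amino-acid offset) for each exact peptide match."""
    hits = []
    for frame, protein in frames.items():
        start = protein.find(peptide)
        while start != -1:
            hits.append((frame, start))
            start = protein.find(peptide, start + 1)
    return hits


if __name__ == "__main__":
    toy_genome = "ATGTGAAAATAA"  # hypothetical 12-nt sequence
    frames = six_frame_translation(toy_genome)
    for frame in sorted(frames):
        print(frame, frames[frame])
    print(find_peptide(frames, "MWK"))
```

In the toy sequence, the TGA codon in frame +1 is read as tryptophan, so the peptide MWK is found there; under the standard genetic code the same codon would terminate translation. A real search database would additionally split each frame at stop codons and retain genomic coordinates for every resulting segment.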
Results: Proteomic analysis indicates that 483 of 691 (70%) known M. hyopneumoniae strain 232 proteins are expressed under the culture conditions used in this study. Furthermore, 171 of 328 (52%) hypothetical proteins were confirmed. Proteogenomic mapping identified the previously unannotated genes gatC and rpmF, as well as 5-prime extensions to genes mhp063, mhp073, and mhp451, all of which are conserved and annotated in other M. hyopneumoniae strains and Mycoplasma species. Gene prediction with Prodigal, a prokaryotic gene prediction program, fully supports the new genomic coordinates calculated using proteogenomic mapping.
Conclusions: Proteogenomic mapping showed that the protein-coding genes of M. hyopneumoniae strain 232 identified in this study are well annotated. Only 1.8% of mapped peptides did not correspond to genes defined by the current genome annotation. This study also illustrates how proteogenomic mapping can be an important tool to help confirm, correct and append known gene models when using a genome sequence as the search space for peptide mass spectra. Using a gene prediction program that scans for a wide variety of promoters can help ensure that genes are predicted accurately and not missed completely. Furthermore, protein extraction using differential detergent fractionation effectively increases the number of membrane and cytoplasmic proteins identifiable by mass spectrometry.