"Pseudo-pseudogenes" in bacterial genomes: Proteogenomics reveals a wide but low protein expression of pseudogenes in Salmonella enterica

Nucleic Acids Res. 2022 May 20;50(9):5158-5170. doi: 10.1093/nar/gkac302.

Abstract

Pseudogenes (genes disrupted by frameshifts or in-frame stop codons) are ubiquitous in bacterial genomes and are considered nonfunctional fossils. Here, we used RNA-seq and mass spectrometry to measure the transcriptomes and proteomes of Salmonella enterica serovars Paratyphi A and Typhi. The mRNA sequences of all pseudogenes remained disrupted and were present at levels comparable to those of their intact homologs. At the protein level, however, 101 of 161 pseudogenes showed evidence of successful translation, with low expression regardless of growth conditions, genetic background and cause of pseudogenization. The majority of the frameshifting detected was compensatory for -1 frameshift mutations. Readthrough of in-frame stop codons primarily involved UAG, and cytosine was the most frequent base adjacent to the codon. Using a fluorescence reporter system, fifteen pseudogenes were confirmed to be expressed successfully in vivo in Escherichia coli. Expression of the intact copies of the fifteen pseudogenes in S. Typhi affected bacterial pathogenesis, as revealed in human macrophage and epithelial cell infection models. These findings suggest the need to revisit nonstandard translation mechanisms as well as the biological role of pseudogenes in bacterial genomes.
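
As an illustration of the kind of sequence feature the abstract describes for stop-codon readthrough, the following minimal Python sketch scans a coding sequence for premature in-frame stop codons and records each stop codon together with the base immediately downstream of it. The function name `find_inframe_stops` and the toy sequence are hypothetical and are not taken from the study. Treating the "adjacent" base as the downstream one is an assumption, since the abstract does not state which side is meant.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_inframe_stops(mrna):
    """Return (codon_index, stop_codon, downstream_base) for each premature in-frame stop.

    Hypothetical helper for illustration only; assumes the sequence starts
    in frame at the start codon and ends with the natural stop codon.
    """
    mrna = mrna.upper().replace("T", "U")
    hits = []
    # Start of the last complete codon, taken to be the natural stop and skipped.
    last_codon_start = len(mrna) - len(mrna) % 3 - 3
    for start in range(0, last_codon_start, 3):
        codon = mrna[start:start + 3]
        if codon in STOP_CODONS:
            # Assumption: "adjacent" means the base directly 3' of the stop codon.
            downstream_base = mrna[start + 3]
            hits.append((start // 3, codon, downstream_base))
    return hits

# Toy sequence (not real data): AUG GCU UAG CGA UAA
toy_hits = find_inframe_stops("AUGGCUUAGCGAUAA")
print(toy_hits)
```

For the toy sequence, the single premature stop is a UAG followed by a cytosine, which is the context the abstract reports as most frequently associated with readthrough.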

MeSH terms

  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Codon, Terminator
  • Gene Expression
  • Genome, Bacterial
  • Proteogenomics*
  • Pseudogenes* / genetics
  • Salmonella paratyphi A / genetics*
  • Salmonella typhi / genetics*

Substances

  • Bacterial Proteins
  • Codon, Terminator