An integrated approach for finding overlooked genes in Shigella

PLoS One. 2011 Apr 5;6(4):e18509. doi: 10.1371/journal.pone.0018509.

Abstract

Background: The completion of numerous genome sequences introduced an era of whole-genome study. However, many genes are missed during genome annotation, including small RNAs (sRNAs) and small open reading frames (sORFs). To improve genome annotation, we aimed to identify novel sRNAs and sORFs in Shigella, the principal etiologic agents of bacillary dysentery.

Methodology/principal findings: Based on sequence conservation, we identified 64 sRNAs in Shigella that had been experimentally validated in other bacteria. We employed computer-based and tiling array-based methods to search for sRNAs, followed by RT-PCR and northern blots, and identified nine sRNAs in Shigella flexneri strain 301 (Sf301) and 256 regions containing possible sRNA genes. We found 29 candidate sORFs using bioinformatic prediction, array hybridization and RT-PCR verification. We experimentally validated 557 (57.9%) DOOR operon predictions in the chromosomes of Sf301 and 46 (76.7%) in the virulence plasmid. We found 40 additional co-expressed gene pairs that were not predicted by DOOR.

Conclusions/significance: We provide an updated and comprehensive annotation of the Shigella genome. Our study increased the expected numbers of sORFs and sRNAs, which will affect future functional genomics and proteomics studies. Our method can be used for large-scale reannotation of sRNAs and sORFs in any microbe with a known genome sequence.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Blotting, Northern
  • Computational Biology / methods*
  • Genes, Bacterial / genetics*
  • Genomics
  • Nucleic Acid Hybridization
  • Open Reading Frames / genetics
  • Operon / genetics
  • RNA, Bacterial / genetics
  • RNA, Untranslated / genetics
  • Reverse Transcriptase Polymerase Chain Reaction
  • Shigella / genetics*
  • Systems Integration*

Substances

  • RNA, Bacterial
  • RNA, Untranslated