An in silico model for identification of small RNAs in whole bacterial genomes: characterization of antisense RNAs in pathogenic Escherichia coli and Streptococcus agalactiae strains

Nucleic Acids Res. 2012 Apr;40(7):2846-61. doi: 10.1093/nar/gkr1141. Epub 2011 Dec 1.

Abstract

Characterization of small non-coding ribonucleic acids (sRNAs) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software has been developed based on either comparative genomic analyses or mathematical/statistical models. Although the comparative genomic analyses enabled sRNAs in intergenic regions to be identified efficiently, they all failed to predict antisense sRNA genes (asRNAs), i.e. RNA genes located on the DNA strand complementary to the one that encodes the protein. The statistical models can, in theory, analyze any genomic region, but they do not do so efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and the Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of type 1 fimbriae production, which is involved in the virulence of extra-intestinal pathogenic E. coli.
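To make the asRNA definition above concrete, the following minimal Python sketch shows what "located on the complementary strand" means in terms of coordinates and sequence. It is only an illustration of the definition, not the prediction model presented in the paper; the `Feature` class, the `is_antisense` helper, and the example coordinates are hypothetical names and values chosen for this sketch.

```python
from dataclasses import dataclass

# Translation table for DNA complementation (assumes plain A/C/G/T input).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Feature:
    """Hypothetical genomic feature with 1-based inclusive coordinates."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def is_antisense(candidate: Feature, gene: Feature) -> bool:
    """True if the candidate overlaps the gene and lies on the opposite strand."""
    overlaps = candidate.start <= gene.end and gene.start <= candidate.end
    return overlaps and candidate.strand != gene.strand


# Illustrative values only; these are not coordinates from the paper.
gene = Feature("geneX", 100, 300, "+")
candidate = Feature("candidate_asRNA", 150, 250, "-")
print(is_antisense(candidate, gene))
print(reverse_complement("ATGC"))
```

A candidate that overlaps a protein-coding gene but sits on the same strand, or one that lies on the opposite strand without overlapping, would not count as antisense under this simple check.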

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Antigens, Bacterial / genetics
  • Base Pairing
  • Biofilms
  • Computer Simulation*
  • Escherichia coli / genetics*
  • Escherichia coli / pathogenicity
  • Escherichia coli / physiology
  • Escherichia coli Proteins / genetics
  • Escherichia coli Proteins / physiology
  • Fimbriae Proteins / genetics
  • Fimbriae Proteins / metabolism
  • Gene Expression Regulation, Bacterial
  • Genome, Bacterial*
  • Genomic Islands
  • Host Factor 1 Protein / physiology
  • RNA, Antisense / chemistry
  • RNA, Antisense / genetics
  • RNA, Antisense / metabolism*
  • RNA, Messenger / metabolism
  • Regulon
  • Streptococcus agalactiae / genetics*
  • Streptococcus agalactiae / pathogenicity

Substances

  • Antigens, Bacterial
  • Escherichia coli Proteins
  • Hfq protein, E coli
  • Host Factor 1 Protein
  • RNA, Antisense
  • RNA, Messenger
  • SIP protein, Streptococcus group B
  • fimD protein, E coli
  • Fimbriae Proteins