A bioinformatics based approach to discover small RNA genes in the Escherichia coli genome

Biosystems. 2002 Mar-May;65(2-3):157-77. doi: 10.1016/s0303-2647(02)00013-8.


The recent explosion in available bacterial genome sequences has created a need to improve our ability to annotate important sequence and structural elements quickly, efficiently and accurately. In particular, small non-coding RNAs (sRNAs) have been difficult to predict. sRNAs play a number of important structural, catalytic and regulatory roles in the cell. Although a few groups have recently published methods for predicting sRNAs in bacterial genomes, much remains to be done in this field. Toward the goal of developing an efficient method for predicting unknown sRNA genes in the completed Escherichia coli genome, we adopted a bioinformatics approach to search for DNA regions that contain a sigma70 promoter within a short distance of a rho-independent terminator. Of the 227 candidate sRNA genes initially identified, 32 are previously described sRNAs, orphan tRNAs, and partial tRNA and rRNA operons. Fifty-one are mRNA genes encoding annotated, extremely small open reading frames (ORFs) that follow an acceptable ribosome binding site. The remaining 144 are potentially novel non-translatable sRNA genes. Using total RNA isolated from E. coli MG1655 cells grown under four different conditions, we verified transcripts of some of these genes by Northern hybridization. Here we summarize our data and discuss the rules, advantages and disadvantages of using this approach to annotate sRNA genes in bacterial genomes.
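The abstract does not say which programs, scoring schemes or distance cutoffs were used, so the following Python sketch is only a toy illustration of the general idea: pair a sigma70-like promoter with a downstream rho-independent-like terminator. It assumes the textbook -35 (TTGACA) and -10 (TATAAT) consensus hexamers with a 15 to 19 nt spacer, and it models a terminator as a perfectly paired stem-loop immediately followed by a run of T residues. All function names, parameters and thresholds (`max_mismatch`, `stem`, `loops`, `min_t`, `max_span`) are hypothetical and are not taken from the paper.

```python
import re

MINUS_35 = "TTGACA"  # textbook sigma70 -35 consensus (assumed here, not from the paper)
MINUS_10 = "TATAAT"  # textbook sigma70 -10 consensus (assumed here, not from the paper)


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def revcomp(s):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return s.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_promoters(seq, max_mismatch=1, spacers=range(15, 20)):
    """Return (start, end) spans, end exclusive, of -35/-10 hexamer pairs."""
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for sp in spacers:
            j = i + 6 + sp  # start of the putative -10 hexamer
            if j + 6 > len(seq):
                break
            if mismatches(seq[j:j + 6], MINUS_10) <= max_mismatch:
                hits.append((i, j + 6))
    return hits


def find_terminators(seq, stem=5, loops=range(3, 9), min_t=4):
    """Return (start, end) spans of a perfect stem-loop followed by a T run."""
    hits = []
    for m in re.finditer("T{%d,}" % min_t, seq):
        t = m.start()
        if t < stem:
            continue
        right = seq[t - stem:t]  # stem arm directly upstream of the T run
        for loop in loops:
            left_start = t - 2 * stem - loop
            if left_start < 0:
                break
            left = seq[left_start:left_start + stem]
            if left == revcomp(right):  # the two arms can base-pair
                hits.append((left_start, m.end()))
                break
    return hits


def candidate_srnas(seq, max_mismatch=1, max_span=400):
    """Pair each promoter with any downstream terminator within max_span nt."""
    seq = seq.upper()
    terminators = find_terminators(seq)
    out = []
    for p_start, p_end in find_promoters(seq, max_mismatch=max_mismatch):
        for t_start, t_end in terminators:
            if p_end <= t_start and t_end - p_start <= max_span:
                out.append((p_start, t_end))
    return out
```

A real screen would need scored position weight matrices, a thermodynamic hairpin model, both strands, and filtering against annotated genes and ORFs; exact string matching like this would both miss genuine signals and report many spurious ones on a full genome.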

MeSH terms

  • Base Sequence
  • Computational Biology*
  • DNA Primers
  • Escherichia coli / genetics*
  • Genome, Bacterial*
  • Open Reading Frames
  • Polymerase Chain Reaction
  • RNA, Bacterial / genetics*

