MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes

DNA Res. 2008 Dec;15(6):387-96. doi: 10.1093/dnares/dsn027. Epub 2008 Oct 21.


Recent advances in DNA sequencers are accelerating genome sequencing, especially in microbes, and complete and draft genomes from various species have been sequenced in rapid succession. Here, we present a comprehensive gene prediction tool, the MetaGeneAnnotator (MGA), which precisely predicts all kinds of prokaryotic genes from a single anonymous genomic sequence or a set of such sequences of varying lengths. The MGA integrates statistical models of prophage genes, in addition to those of bacterial and archaeal genes, and also uses a self-training model derived from the input sequences for its predictions. As a result, the MGA sensitively detects not only typical genes but also atypical genes, such as horizontally transferred and prophage genes, in a prokaryotic genome. In this paper, we also propose a novel approach for analyzing the ribosomal binding site (RBS), which enables us to detect species-specific RBS patterns. The MGA incorporates an RBS model based on this approach and precisely predicts the translation starts of genes. By using the adapted RBS models, the MGA also improves prediction accuracy for short sequences (96% sensitivity and 93% specificity for 700 bp fragments). These features of the MGA expedite a wide range of microbial genome studies, such as genome annotation and metagenome analysis.
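
As a rough illustration of what looking for an RBS upstream of a candidate translation start involves, the Python sketch below scans a fixed upstream window for the closest match to a Shine-Dalgarno-like consensus. It is not the MGA's RBS model: the function name `best_sd_match`, the consensus `AGGAGG`, the 20 bp window, and the 4 bp minimum spacer are all assumptions chosen for illustration, whereas the approach described in the abstract detects species-specific patterns rather than assuming a fixed one.

```python
def best_sd_match(seq, start, motif="AGGAGG", upstream=20, min_spacer=4):
    """Find the closest match to `motif` upstream of a start codon.

    seq        -- DNA sequence (forward strand) as a string
    start      -- 0-based index of the first base of the start codon
    motif      -- assumed Shine-Dalgarno-like consensus (illustrative only)
    upstream   -- how many bases before `start` to search (assumption)
    min_spacer -- minimum gap between motif end and start codon (assumption)

    Returns (motif_position, mismatches) for the best window,
    or None if no window fits in the upstream region.
    """
    seq = seq.upper()
    first = max(0, start - upstream)
    last = start - min_spacer - len(motif)  # last allowed motif start
    best = None
    for i in range(first, last + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        # Keep the first window with the fewest mismatches.
        if best is None or mismatches < best[1]:
            best = (i, mismatches)
    return best


if __name__ == "__main__":
    # Toy sequence: a perfect AGGAGG, a 6 bp spacer, then an ATG start codon.
    toy = "TTTTAGGAGGTTTTTTATGAAA"
    start_codon = toy.index("ATG")
    print(best_sd_match(toy, start_codon))
```

A fixed consensus like this one would miss genomes whose RBS patterns differ from the assumed motif, which is the limitation that a species-specific RBS model is meant to address.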

MeSH terms

  • Algorithms
  • Bacteria / genetics
  • Bacteriophages / genetics
  • Binding Sites
  • Computational Biology / methods*
  • Genes, Bacterial*
  • Genes, Viral*
  • Genome, Bacterial / genetics*
  • Genome, Viral / genetics*
  • Plasmids / genetics
  • Predictive Value of Tests
  • Protein Biosynthesis
  • Ribosomes / metabolism*
  • Species Specificity