Using shotgun sequence data to find active restriction enzyme genes

Nucleic Acids Res. 2009 Jan;37(1):e1. doi: 10.1093/nar/gkn883. Epub 2008 Nov 6.


Whole-genome shotgun sequence analysis has become the standard method for beginning to determine a genome sequence. The preparation of the shotgun sequence clones is, in fact, a biological experiment: it determines which segments of the genome can be cloned into Escherichia coli and which cannot. By analyzing the complete set of sequences from such an experiment, it is possible to identify genes lethal to E. coli. Among this set are genes encoding restriction enzymes, which, when active in E. coli, lead to cell death by cleaving the E. coli genome at the restriction enzyme recognition sites. By analyzing shotgun sequence data sets, we show that this is a reliable method to detect active restriction enzyme genes in newly sequenced genomes, thereby facilitating functional annotation. Active restriction enzyme genes have been identified, and their activity demonstrated biochemically, in the sequenced genomes of Methanocaldococcus jannaschii, Bacillus cereus ATCC 10987 and Methylococcus capsulatus.
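
The idea in the abstract can be illustrated with a minimal sketch. It assumes that each shotgun clone has already been mapped to start and end coordinates on the assembled genome, and it flags any gene that no single clone contains in full. The function name, the containment criterion, and the toy coordinates are hypothetical illustrations and are not taken from the paper, which may use a different procedure.

```python
from typing import Dict, List, Tuple

# (start, end) coordinates on the assembled genome, 0-based, end-exclusive.
Interval = Tuple[int, int]


def genes_never_cloned_intact(
    genes: Dict[str, Interval], clones: List[Interval]
) -> List[str]:
    """Return the names of genes that no single clone contains in full.

    Assumption for this sketch: a gene that is lethal to E. coli cannot be
    recovered intact, so every clone overlapping it stops somewhere inside it.
    """
    flagged = []
    for name, (g_start, g_end) in genes.items():
        intact = any(
            c_start <= g_start and g_end <= c_end for c_start, c_end in clones
        )
        if not intact:
            flagged.append(name)
    return flagged


# Toy data with hypothetical coordinates.
genes = {
    "geneA": (100, 400),
    "putative_REase": (1000, 2200),
    "geneC": (3000, 3500),
}
clones = [(0, 900), (50, 1500), (1800, 3600), (2500, 4000)]

print(genes_never_cloned_intact(genes, clones))  # → ['putative_REase']
```

In this toy example every clone that overlaps `putative_REase` ends inside it, so it is the only gene reported. A flagged gene is only a candidate: the abstract notes that activity was also demonstrated biochemically for the genes identified.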

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacillus cereus
  • DNA Modification Methylases / genetics
  • DNA Restriction Enzymes / genetics*
  • Genome, Bacterial*
  • Genomics / methods*
  • Haemophilus influenzae / genetics
  • Helicobacter pylori
  • Methanococcales
  • Methylococcus capsulatus

Substances

  • DNA Modification Methylases
  • DNA Restriction Enzymes