BAGEL2: mining for bacteriocins in genomic data

Nucleic Acids Res. 2010 Jul;38(Web Server issue):W647-51. doi: 10.1093/nar/gkq365. Epub 2010 May 12.


Mining bacterial genomes for bacteriocins is a challenging task due to the substantial structural and sequence diversity, and generally small sizes, of these antimicrobial peptides. Major progress in research on antimicrobial peptides and the ever-increasing quantities of genomic data, varying from (un)finished genomes to metagenomic data, led us to develop the significantly improved genome mining software BAGEL2, as a follow-up to our previous BAGEL software. BAGEL2 identifies putative bacteriocins on the basis of conserved domains, physical properties and the presence of biosynthesis, transport and immunity genes in their genomic context. The software supports parameter-free, class-specific mining and has high-throughput capabilities. Besides building an expert-validated bacteriocin database, we describe the development of novel Hidden Markov Models (HMMs) and the interpretation of combinations of HMMs via simple decision rules for prediction of bacteriocin (sub-)classes. Furthermore, the genetic context is automatically annotated based on (combinations of) PFAM domains and databases of known context genes. The scoring system was fine-tuned using expert knowledge on data derived from screening all bacterial genomes currently available at the NCBI. BAGEL2 is freely accessible at
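
The idea of interpreting combinations of HMM hits through simple decision rules, and of scoring a candidate by its genomic context, can be illustrated with a minimal sketch. This is not the BAGEL2 implementation: the HMM names, class labels, context categories, length cutoff and score weights below are all hypothetical placeholders chosen for illustration, and a real pipeline would obtain the hits from an HMM search tool rather than from hand-written sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    """A candidate open reading frame with precomputed annotations."""
    name: str
    length: int                  # peptide length in amino acids
    hmm_hits: frozenset          # names of HMMs that matched this ORF
    context_domains: frozenset   # categories of domains found in neighbouring genes


# Hypothetical decision rules: if all required HMMs hit, predict the label.
# Rules are checked in order; the first satisfied rule wins.
RULES = [
    (frozenset({"leader_A", "core_A"}), "class_X"),
    (frozenset({"core_B"}), "class_Y"),
]

# Hypothetical weights for context genes near the candidate.
CONTEXT_BONUS = {"transporter": 2, "immunity": 2, "modification": 3}

# Hypothetical size cutoff reflecting that these peptides are generally small.
MAX_LENGTH = 120


def classify(orf):
    """Return the first class whose required HMM combination is present, else None."""
    for required, label in RULES:
        if required <= orf.hmm_hits:  # subset test: every required HMM matched
            return label
    return None


def score(orf):
    """Combine class evidence, peptide size and context genes into one integer score."""
    total = 0
    if classify(orf) is not None:
        total += 5
    if orf.length <= MAX_LENGTH:
        total += 1
    total += sum(bonus for domain, bonus in CONTEXT_BONUS.items()
                 if domain in orf.context_domains)
    return total


candidate = Orf(
    name="orf1",
    length=60,
    hmm_hits=frozenset({"leader_A", "core_A"}),
    context_domains=frozenset({"transporter", "modification"}),
)
print(candidate.name, classify(candidate), score(candidate))
```

Keeping the rules as plain data (sets of required hits mapped to labels) means new (sub-)classes can be added without touching the classification logic, which is one plausible reading of "simple decision rules"; the actual rule set and weights used by the software are not given in this abstract.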

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Anti-Bacterial Agents* / chemistry
  • Bacteriocins / chemistry
  • Bacteriocins / genetics*
  • Data Mining
  • Genome, Bacterial*
  • Genomics
  • Internet
  • Open Reading Frames
  • Software*

Substances

  • Anti-Bacterial Agents
  • Bacteriocins