SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

PeerJ. 2016 May 24;4:e2056. doi: 10.7717/peerj.2056. eCollection 2016.

Abstract

The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account, which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application that aims to simplify the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and to assist in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well-known command-line tools with a genome browser for visualising regulatory elements in their genomic context. Integrated access to online databases of regulatory information (RegPrecise and RegulonDB) and to web-based search engines speeds up genome analysis and simplifies the correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it did not fit the regulatory information allowed us to correct product and gene names for over 300 loci.

Keywords: Genome annotation; Genome browser; Pectobacterium atrosepticum; Promoter; Sequence logo; Terminator; Transcription factor binding site.

Grants and funding

This work was supported in part by the State Research Programme “Biotechnology” within projects 2.52 and 2.24. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.