TSSer: an automated method to identify transcription start sites in prokaryotic genomes from differential RNA sequencing data

Bioinformatics. 2014 Apr 1;30(7):971-4. doi: 10.1093/bioinformatics/btt752. Epub 2013 Dec 25.

Abstract

Motivation: Accurate identification of transcription start sites (TSSs) is an essential step in the analysis of transcription regulatory networks. In higher eukaryotes, cap analysis of gene expression (CAGE) technology has enabled comprehensive annotation of TSSs in genomes such as those of mice and humans. In bacteria, an equivalent approach, termed differential RNA sequencing (dRNA-seq), has recently been proposed, but its application to a large number of genomes is hindered by the paucity of computational analysis methods. With few exceptions, when the method has been used, annotation of TSSs has largely been done manually.

Results: In this work, we present a computational method called 'TSSer' that enables the automatic inference of TSSs from dRNA-seq data. The method rests on a probabilistic framework for identifying genomic positions that are both preferentially enriched in the dRNA-seq data and preferentially captured relative to neighboring genomic regions. Evaluating our approach for TSS calling on several publicly available datasets, we find that TSSer achieves high consistency with the curated lists of annotated TSSs, but identifies many additional TSSs. Therefore, TSSer can accelerate genome-wide identification of TSSs in bacterial genomes and can aid in further characterization of bacterial transcription regulatory networks.
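
Illustration: the two criteria named above (enrichment in the dRNA-seq data, and preferential capture relative to neighboring positions) can be sketched in a few lines of Python. This is not TSSer's probabilistic framework, which the abstract does not specify. It swaps in a simple log2 ratio of normalized read-start frequencies and a strict local-maximum test. The function name candidate_tss, the parameters window, min_enrich and pseudo, and the toy counts are all hypothetical.

```python
from math import log2

def candidate_tss(treated, untreated, window=5, min_enrich=1.0, pseudo=1.0):
    """Return positions that pass two simple, assumed criteria.

    treated / untreated: per-position read-start counts from the enriched
    and the control library (hypothetical inputs of equal length).
    """
    n = len(treated)
    # Normalize by library size; the pseudocount avoids division by zero.
    total_t = sum(treated) + pseudo * n
    total_u = sum(untreated) + pseudo * n
    hits = []
    for i, (t, u) in enumerate(zip(treated, untreated)):
        # Criterion 1: enrichment in the treated library over the control.
        enrichment = log2(((t + pseudo) / total_t) / ((u + pseudo) / total_u))
        # Criterion 2: strictly more treated reads than any neighbor in the window.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbors = treated[lo:i] + treated[i + 1:hi]
        is_local_max = all(t > x for x in neighbors)
        if enrichment >= min_enrich and is_local_max:
            hits.append(i)
    return hits

# Toy data: a single sharp peak at position 3.
treated = [0, 1, 0, 50, 2, 0, 0, 1, 0, 0]
untreated = [1, 1, 1, 5, 2, 1, 1, 1, 1, 1]
print(candidate_tss(treated, untreated))  # [3]
```

Hard thresholds are used here only to keep the sketch short; a probabilistic method would instead score each position and could weigh evidence across replicates.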

Availability: TSSer is freely available under the GPL license at http://www.clipz.unibas.ch/TSSer/index.php

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Automation, Laboratory / methods*
  • Base Sequence
  • Genome, Bacterial*
  • Genomics
  • Sequence Analysis, RNA / methods*
  • Transcription Initiation Site*