Unipro UGENE NGS pipelines and components for variant calling, RNA-seq and ChIP-seq data analyses

PeerJ. 2014 Nov 4:2:e644. doi: 10.7717/peerj.644. eCollection 2014.

Abstract

The advent of Next Generation Sequencing (NGS) technologies has opened new possibilities for researchers. However, as biology becomes an increasingly data-intensive field, biologists have to learn how to process and analyze NGS data with complex computational tools. Even when common pipeline specifications are available, installing and configuring the pipeline tools is often a time-consuming and cumbersome task for a bench scientist. We believe that a unified, biologist-friendly desktop front end to NGS data analysis tools will substantially improve productivity in this field. Here we present three NGS pipelines, "Variant Calling with SAMtools", "Tuxedo Pipeline for RNA-seq Data Analysis" and "Cistrome Pipeline for ChIP-seq Data Analysis", integrated into the Unipro UGENE desktop toolkit. We describe the available UGENE infrastructure that helps researchers run these pipelines on different datasets, store and investigate the results, and re-run the pipelines with the same parameters. These pipeline tools are included in the UGENE NGS package. Individual blocks of these pipelines are also available for expert users to create their own advanced workflows.

Keywords: Bioinformatics; ChIP-seq; Data analysis; Next-generation sequencing; RNA-seq; Variant calling.

Grants and funding

This project was supported by the Office of Science Management and Operations (OSMO) of the NIAID. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.