PeerJ. 2014 Nov 4;2:e644.
doi: 10.7717/peerj.644. eCollection 2014.

Unipro UGENE NGS Pipelines and Components for Variant Calling, RNA-seq and ChIP-seq Data Analyses

Free PMC article

Olga Golosova et al. PeerJ.

Abstract

The advent of Next Generation Sequencing (NGS) technologies has opened new possibilities for researchers. However, as biology becomes an increasingly data-intensive field, biologists must learn to process and analyze NGS data with complex computational tools. Even when common pipeline specifications are available, installing and configuring the pipeline tools is often a time-consuming and cumbersome task for a bench scientist. We believe that a unified, biologist-friendly desktop front end to NGS data analysis tools will substantially improve productivity in this field. Here we present the NGS pipelines "Variant Calling with SAMtools", "Tuxedo Pipeline for RNA-seq Data Analysis" and "Cistrome Pipeline for ChIP-seq Data Analysis", integrated into the Unipro UGENE desktop toolkit. We describe the UGENE infrastructure that helps researchers run these pipelines on different datasets, store and investigate the results, and re-run the pipelines with the same parameters. These pipeline tools are included in the UGENE NGS package. Individual blocks of these pipelines are also available for expert users to create their own advanced workflows.

Keywords: Bioinformatics; ChIP-seq; Data analysis; Next-generation sequencing; RNA-seq; Variant calling.

Figures

Figure 1. SAMtools workflow in a Workflow Designer window.
The workflow itself can be seen at the center of the window on the Workflow Designer scene. The left side of the window shows available workflow elements (i.e., building blocks for a workflow), grouped by categories. The “NGS: Variant Calling” group, in particular, is opened. The right side of the window displays the description of the currently selected element “Call Variants”.
Figure 2. Tuxedo workflow in a Workflow Designer window.
The workflow is shown at the center of the window on the Workflow Designer scene. This is a full version of the pipeline with paired-end reads used as input. The left side of the window shows building blocks available for building a new RNA-Seq analysis workflow. The right side of the window shows parameters of the selected “Find Splice Junctions with TopHat” element.
Figure 3. Cistrome workflow in a Workflow Designer window.
The elements of the workflow can be seen at the center of the window on the Workflow Designer scene. Building blocks for a new ChIP-Seq analysis pipeline are shown on the left side of the window. A description of the currently selected element (“Find Peaks with MACS”) is shown on the right side of the window.
Figure 4. Wizard.
A wizard page of the Cistrome pipeline is shown. On this page one can configure parameters of the MACS tool.
Figure 5. SAMtools workflow results in dashboard.
A dashboard window with a result from running the SAMtools pipeline is shown. The “Overview” page of the dashboard is opened. It contains a link to the output variants file.
Figure 6. Tuxedo workflow results in dashboard.
A dashboard window with the result of running the Tuxedo pipeline is shown. The “Overview” page of the dashboard is opened. The output files are grouped by the workflow elements that produced them. One of the groups, containing 11 result files, is expanded. A user can open a result file in UGENE by clicking on it in the dashboard. Alternatively, each file can be opened outside UGENE (i.e., by the operating system), or the directory that contains the file can be opened directly from the dashboard.
Figure 7. Cistrome workflow results in dashboard.
A dashboard window with the result of running the Cistrome pipeline is shown. The “Overview” page of the dashboard is opened.
Figure 8. Cistrome result in dashboard: MACS input parameters.
A dashboard window with the result of running the Cistrome pipeline is shown. The “Input” page of the dashboard is opened. Input parameters that were used to run the pipeline are shown. Parameters of the “Find Peaks with MACS” element are currently selected.
Figure 9. Cistrome result in dashboard: details about the tools used.
A dashboard window with the result of running the Cistrome pipeline is shown. The “External Tools” page of the dashboard is opened. It contains details about the external tool runs (MACS, CEAS, seqpos, etc.). The details about the “go_analysis” tool are expanded.

Grant support

This project was supported by the Office of Science Management and Operations (OSMO) of the NIAID. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
