dictyExpress: a web-based platform for sequence data management and analytics in Dictyostelium and beyond

BMC Bioinformatics. 2017 Jun 2;18(1):291. doi: 10.1186/s12859-017-1706-9.

Abstract

Background: Dictyostelium discoideum, a soil-dwelling social amoeba, is a model for the study of numerous biological processes. Research in the field has benefited mightily from the adoption of next-generation sequencing for genomics and transcriptomics. Dictyostelium biologists now face the widespread challenges of analyzing and exploring high-dimensional data sets to generate hypotheses and discover novel insights.

Results: We present dictyExpress (2.0), a web application designed for exploratory analysis of gene expression data, as well as data from related experiments such as Chromatin Immunoprecipitation sequencing (ChIP-Seq). The application features visualization modules that include time course expression profiles, clustering, gene ontology enrichment analysis, differential expression analysis and comparison of experiments. All visualizations are interactive and interconnected, such that the selection of genes in one module propagates instantly to visualizations in other modules. dictyExpress currently stores the data from over 800 Dictyostelium experiments and is embedded within a general-purpose software framework for management of next-generation sequencing data. dictyExpress allows users to explore their data in a broader context by reciprocal linking with dictyBase, a repository of Dictyostelium genomic data. In addition, we introduce a companion application called GenBoard, an intuitive graphical user interface for data management and bioinformatics analysis.
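
Illustrative sketch: the interconnected behavior described in the Results, where a gene selection made in one module propagates to the others, resembles a publish/subscribe pattern. The Python below is a minimal sketch of that idea only. It is not taken from the dictyExpress code base; the names SelectionHub and Module and the gene identifiers are hypothetical.

```python
from typing import Callable, Iterable, List, Set


class SelectionHub:
    """Holds the shared gene selection and notifies every subscriber (hypothetical)."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[Set[str]], None]] = []

    def subscribe(self, listener: Callable[[Set[str]], None]) -> None:
        self._listeners.append(listener)

    def select(self, genes: Iterable[str]) -> None:
        selection = set(genes)
        # Each listener receives its own copy so modules cannot mutate each other's state.
        for listener in self._listeners:
            listener(set(selection))


class Module:
    """Stand-in for a visualization module such as expression profiles or clustering."""

    def __init__(self, name: str, hub: SelectionHub) -> None:
        self.name = name
        self.highlighted: Set[str] = set()
        hub.subscribe(self.on_selection)

    def on_selection(self, genes: Set[str]) -> None:
        # A real module would redraw its plot; here we only record the selection.
        self.highlighted = genes


hub = SelectionHub()
profiles = Module("time course profiles", hub)
clustering = Module("clustering", hub)

# Selecting genes anywhere goes through the hub, so every module sees the same set.
hub.select(["geneA", "geneB"])  # placeholder gene identifiers
```

Routing every selection through a single hub keeps the modules unaware of one another, which is one plausible way to let new visualizations join without changing the existing ones.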

Conclusions: dictyExpress and GenBoard enable broad adoption of next-generation-sequencing-based inquiries by the Dictyostelium research community. Labs without the means to undertake deep sequencing projects can mine the data available to the public. The entire information flow, from raw sequence data to hypothesis testing, can be accomplished in an efficient workspace. The software framework is generalizable and represents a useful approach for any research community. To encourage wider usage, the backend is open-source, available for extension and further development by bioinformaticians and data scientists.

Keywords: Bioinformatics; ChIP-seq; Differential gene expression; Platform; RNA-seq; Visual analytics.

MeSH terms

  • Chromatin Immunoprecipitation
  • Cluster Analysis
  • Dictyostelium / genetics
  • Dictyostelium / metabolism*
  • High-Throughput Nucleotide Sequencing
  • Internet
  • Sequence Analysis, RNA
  • Transcriptome
  • User-Computer Interface*