BioMAJ2Galaxy: automatic update of reference data in Galaxy using BioMAJ

Gigascience. 2015 May 9;4:22. doi: 10.1186/s13742-015-0063-8. eCollection 2015.

Abstract

Background: Many bioinformatics tools use reference data, such as genome assemblies or sequence databanks. Galaxy offers multiple ways to give access to this data through its web interface. However, adding new reference data was customarily a manual and time-consuming process, even more so when the data needed to be indexed in a variety of formats (e.g. Blast, Bowtie, BWA, or 2bit). BioMAJ is a widely used, stable software tool designed to automate the download and transformation of data from various sources. The resulting data can be used directly from the command line, within more complex systems such as Mobyle, or through a REST API.

Findings: To ease the process of giving access to reference data in Galaxy, we have developed the BioMAJ2Galaxy module, which bridges the gap between BioMAJ and Galaxy. With this module, BioMAJ can be configured to automatically download reference data, convert and/or index it in various formats, and then make it available in a Galaxy server using data libraries or data managers.
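
As a rough illustration of the last step of that workflow, the sketch below (plain Python, standard library only) lists the files of a downloaded and indexed bank release that would be registered in a Galaxy data library. It is not the BioMAJ2Galaxy code: the LibraryEntry structure, the plan_library_entries function, and the one-subdirectory-per-format layout are all assumptions made for illustration. Marking each entry as linked in place rather than copied shows one way such an integration could avoid duplicating data on disk.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LibraryEntry:
    """One file to expose in a Galaxy data library (hypothetical structure)."""
    source: Path     # location of the file inside the bank release
    folder: str      # target folder inside the data library
    link_only: bool  # reference the file in place instead of copying it


def plan_library_entries(release_dir, index_formats, library_folder):
    """List the files of one bank release to register in Galaxy.

    Assumes a hypothetical layout: <release_dir>/<format>/<files>.
    Formats with no matching subdirectory are skipped.
    """
    release_dir = Path(release_dir)
    entries = []
    for fmt in index_formats:
        fmt_dir = release_dir / fmt
        if not fmt_dir.is_dir():
            continue  # this format was not produced for the release
        for path in sorted(p for p in fmt_dir.iterdir() if p.is_file()):
            entries.append(LibraryEntry(path, f"{library_folder}/{fmt}", True))
    return entries
```

A real integration would then pass each planned entry to the Galaxy API; that call is deliberately left out here because its details depend on the Galaxy version and client library in use.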

Conclusions: The developments presented in this paper allow us to integrate reference data in Galaxy in an automatic, reliable, and disk-space-saving way. The code is freely available on the GenOuest GitHub account (https://github.com/genouest/biomaj2galaxy).

Keywords: BioMAJ; Data libraries; Data manager; Galaxy; Reference data.

MeSH terms

  • Automation
  • Computational Biology*
  • Internet*
  • User-Computer Interface*