Sapporo: A workflow execution service that encourages the reuse of workflows in various languages in bioinformatics

F1000Res. 2024 Jun 24:11:889. doi: 10.12688/f1000research.122924.2. eCollection 2022.

Abstract

The increasing demand for efficient computation in data analysis encourages researchers in biomedical science to use workflow systems. Workflow systems, or so-called workflow languages, are used to describe and execute a set of data analysis steps. They increase the productivity of researchers, particularly in fields that rely on high-throughput DNA sequencing applications, where scalable computation is required. Because these systems have improved the portability of data analysis workflows, research communities can share workflows and reduce the cost of building routine analysis procedures. However, the presence of multiple workflow systems in a research field has divided effort across different workflow system communities. Because each workflow system has its own characteristics, it is not feasible to learn every system in order to use publicly shared workflows. We therefore developed Sapporo, an application that provides a unified layer of workflow execution over the differences among workflow systems. Sapporo has two components: an application programming interface (API) that receives workflow run requests, and a browser-based client for the API. The API follows the Workflow Execution Service (WES) API standard proposed by the Global Alliance for Genomics and Health (GA4GH). The current implementation supports the execution of workflows in four languages: Common Workflow Language, Workflow Description Language, Snakemake, and Nextflow. With its extensible and scalable design, Sapporo can help the research community make use of valuable resources for data analysis.

Keywords: open science; workflow; workflow execution service; workflow language.
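
To make concrete what a workflow run request to a WES-compliant API looks like, the following is a minimal sketch in Python. It only assembles the request, using form field names from the GA4GH WES specification (workflow_url, workflow_type, workflow_type_version, workflow_params) and the specification's `POST /runs` endpoint. The base URL, the example workflow URL, the parameter values, and the helper names `build_run_request` and `is_terminal` are assumptions made for illustration; they are not taken from the Sapporo paper or its implementation.

```python
import json
from urllib.parse import urljoin

# Terminal run states from the GA4GH WES 1.0 "State" enumeration.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def build_run_request(base_url, workflow_url, workflow_type,
                      workflow_type_version, workflow_params):
    """Return (endpoint, form_fields) for a WES `POST /runs` request.

    Hypothetical helper: it only prepares the request and sends nothing.
    """
    endpoint = urljoin(base_url.rstrip("/") + "/", "runs")
    form_fields = {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        # WES expects workflow_params as a JSON-encoded string.
        "workflow_params": json.dumps(workflow_params, sort_keys=True),
    }
    return endpoint, form_fields


def is_terminal(state):
    """True if a WES run state means the run will not progress further."""
    return state in TERMINAL_STATES


endpoint, fields = build_run_request(
    base_url="http://localhost:8080",  # assumed address of a WES server
    workflow_url="https://example.org/workflows/trim.cwl",  # placeholder
    workflow_type="CWL",
    workflow_type_version="v1.0",
    workflow_params={"threads": 2},  # placeholder parameters
)
print(endpoint)
print(fields["workflow_params"])
```

A client would send `fields` as multipart/form-data to `endpoint` with any HTTP library, then poll the specification's `GET /runs/{run_id}/status` endpoint until `is_terminal` returns true for the reported state. Because the request shape is the same for every supported language, only `workflow_type` and `workflow_type_version` would change between, say, a CWL and a WDL workflow.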

MeSH terms

  • Computational Biology* / methods
  • Programming Languages
  • Software*
  • Workflow*

Grants and funding

This study was supported by JSPS KAKENHI (Grant Number 20J22439; assigned to H.S.), the Life Science Database Integration Project, and the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST). DDBJ is supported by the Research Organization of Information and Systems (ROIS) under the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) of Japan. This study was also supported by the CREST program of the Japan Science and Technology Agency (Grant Number JPMJCR17A1, assigned to T.I.).