isma: an R package for the integrative analysis of mutations detected by multiple pipelines

BMC Bioinformatics. 2019 Feb 28;20(1):107. doi: 10.1186/s12859-019-2701-0.

Abstract

Background: Recent comparative studies have shown that somatic mutation detection from next-generation sequencing data remains an open problem in bioinformatics, because different pipelines reach only low consensus. Integrating the results of multiple calling tools has therefore been suggested, but this operation is not trivial, and the burden of merging, comparing, filtering and explaining the results demands appropriate software.

Results: We developed isma (integrative somatic mutation analysis), an R package for the integrative analysis of somatic mutations detected by multiple pipelines for matched tumor-normal samples. The package provides a series of functions to quantify the consensus, estimate the variability, highlight outliers, integrate evidence from publicly available mutation catalogues and filter sites. We illustrate the capabilities of isma by analysing breast cancer somatic mutations generated by The Cancer Genome Atlas (TCGA) using four pipelines.
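
As an illustration of what "quantifying the consensus" across pipelines can mean, the following minimal Python sketch counts how many pipelines report each site and keeps the sites supported by a minimum number of them. It is not the isma API (isma is an R package and its functions are not described in this abstract): the function names, the pipeline labels and the toy sites are all hypothetical, and representing a site as a (chromosome, position, ref, alt) tuple is an assumption made for the example.

```python
from collections import Counter


def consensus_counts(calls_by_pipeline):
    """Map each site to the number of pipelines that reported it.

    Hypothetical helper: `calls_by_pipeline` maps a pipeline label to an
    iterable of sites, each site being a (chrom, pos, ref, alt) tuple.
    """
    counts = Counter()
    for sites in calls_by_pipeline.values():
        # set() ensures a pipeline contributes at most once per site
        counts.update(set(sites))
    return counts


def filter_by_support(counts, min_pipelines):
    """Return, in sorted order, the sites reported by at least `min_pipelines`."""
    return sorted(site for site, n in counts.items() if n >= min_pipelines)


# Toy input, invented for illustration only (not TCGA data).
calls = {
    "pipeline_A": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "pipeline_B": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")},
    "pipeline_C": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "pipeline_D": {("chr1", 100, "A", "T")},
}

counts = consensus_counts(calls)
# Number of sites at each support level (how many sites were seen by 1, 2, ... pipelines)
support_histogram = Counter(counts.values())
# Sites reported by at least two of the four pipelines
reliable = filter_by_support(counts, min_pipelines=2)
```

A support threshold of this kind is only one possible filtering strategy; in practice the threshold would be chosen together with other evidence, such as presence in public mutation catalogues.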

Conclusions: By comparing different "points of view" on the same data, isma generates a single mutation catalogue and a series of reports that highlight common patterns, variability and sites already catalogued by other studies (e.g. TCGA), which helps users design and apply filtering strategies to select more reliable sites. The package is available for non-commercial users at https://www.itb.cnr.it/isma.

Keywords: Cancer; Data integration; Next-generation sequencing; Somatic mutations.

MeSH terms

  • Computational Biology
  • DNA Mutational Analysis / methods*
  • Genome, Human
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mutation / genetics*
  • Neoplasms / genetics
  • Software*
  • User-Computer Interface