MALVIRUS: an integrated application for viral variant analysis

BMC Bioinformatics. 2022 Apr 19;22(Suppl 15):625. doi: 10.1186/s12859-022-04668-0.

Abstract

Background: Efficiently calling variants from the growing volume of sequencing data produced daily from multiple viral strains is of the utmost importance for tracking the spread of those strains across the globe, as the COVID-19 pandemic demonstrated.

Results: We present MALVIRUS, an easy-to-install and easy-to-use application that assists users in the multiple tasks required to analyze a viral population, such as SARS-CoV-2. MALVIRUS allows users to: (1) construct a variant catalog consisting of a set of variations (SNPs/indels) from the population sequences, (2) efficiently genotype and annotate the catalog variants supported by a read sample, and (3) when the viral species under consideration is SARS-CoV-2, assign the input sample to the most likely Pango lineages using the genotyped variations.
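To make step (3) concrete, the following is a toy sketch of ranking candidate lineages against the variants genotyped in a sample. The abstract does not state how MALVIRUS scores lineages, so the Jaccard similarity used here is an assumption chosen purely for illustration, and the lineage names, variant coordinates, and function names are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# A variant is modeled as (position, reference allele, alternate allele).
Variant = Tuple[int, str, str]


def jaccard(a: FrozenSet[Variant], b: FrozenSet[Variant]) -> float:
    """Jaccard similarity between two variant sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_lineages(
    sample: FrozenSet[Variant],
    lineage_defs: Dict[str, FrozenSet[Variant]],
) -> List[Tuple[str, float]]:
    """Return (lineage, score) pairs, best score first; ties broken by name."""
    scores = [(name, jaccard(sample, variants))
              for name, variants in lineage_defs.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical lineage definitions; these are NOT real Pango lineage data.
LINEAGES: Dict[str, FrozenSet[Variant]] = {
    "lineage_A": frozenset({(100, "A", "G"), (250, "C", "T")}),
    "lineage_B": frozenset({(100, "A", "G"), (400, "G", "GA")}),
}

# Hypothetical set of variants genotyped as present in one read sample.
sample: FrozenSet[Variant] = frozenset(
    {(100, "A", "G"), (250, "C", "T"), (900, "T", "C")}
)

ranking = rank_lineages(sample, LINEAGES)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In this toy input the sample shares two of three variants with lineage_A and one of four with lineage_B, so lineage_A ranks first. A real classifier would also need to account for genotype quality and for variants absent from the catalog, which this sketch ignores.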

Conclusions: Tests on Illumina and Nanopore samples demonstrated the efficiency and effectiveness of MALVIRUS in analyzing SARS-CoV-2 strain samples with respect to both the publicly available data provided by NCBI and the more complete dataset provided by GISAID. A comparison with state-of-the-art tools showed that MALVIRUS is always more precise and often has better recall.

Keywords: Genotyping; Lineage classification; SARS-CoV-2; Sequence analysis; Virus.

MeSH terms

  • COVID-19*
  • Genome, Viral
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mutation
  • Pandemics
  • Phylogeny
  • SARS-CoV-2 / genetics