HaploCoV: unsupervised classification and rapid detection of novel emerging variants of SARS-CoV-2

Commun Biol. 2023 Apr 22;6(1):443. doi: 10.1038/s42003-023-04784-4.

Abstract

Accurate and timely monitoring of the evolution of SARS-CoV-2 is crucial for identifying and tracking potentially more transmissible or virulent viral variants, and for implementing mitigation strategies to limit their spread. Here we introduce HaploCoV, a novel software framework that enables the exploration of SARS-CoV-2 genomic diversity through space and time, to identify novel emerging viral variants and prioritize variants of potential epidemiological interest in a rapid and unsupervised manner. HaploCoV can integrate with any classification/nomenclature and incorporates an effective scoring system for the prioritization of SARS-CoV-2 variants. By performing retrospective analyses of more than 11.5 M genome sequences, we show that HaploCoV demonstrates high levels of accuracy and reproducibility and identifies the large majority of epidemiologically relevant viral variants (as flagged by international health authorities) automatically and with rapid turn-around times. Our results highlight the importance of applying strategies based on the systematic analysis and integration of regional data for the rapid identification of novel, emerging variants of SARS-CoV-2. We believe that the approach outlined in this study will contribute to relevant advances in current and future genomic surveillance methods.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / diagnosis
  • COVID-19* / epidemiology
  • Humans
  • Reproducibility of Results
  • Retrospective Studies
  • SARS-CoV-2 / genetics

Supplementary concepts

  • SARS-CoV-2 variants