SARS2020: an integrated platform for identification of novel coronavirus by a consensus sequence-function model

Bioinformatics. 2021 May 23;37(8):1182-1183. doi: 10.1093/bioinformatics/btaa767.

Abstract

Motivation: The 2019 novel coronavirus outbreak has significantly affected global health and society. Thus, predicting biological function from pathogen sequence is crucial and urgently needed. However, little work has been done to identify viruses by the enzymes they encode, which are key to pathogen propagation.

Results: We built a comprehensive scientific resource, SARS2020, which integrates coronavirus-related research, genomic sequences and results of anti-viral drug trials. In addition, we built a consensus sequence-catalytic function model from which we identified the novel coronavirus as encoding the same proteinase as the severe acute respiratory syndrome virus. This data-driven sequence-based strategy will enable rapid identification of agents responsible for future epidemics.

Availability and implementation: SARS2020 is available at http://design.rxnfinder.org/sars2020/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19*
  • Consensus Sequence
  • Genome
  • Humans
  • SARS-CoV-2
  • Severe acute respiratory syndrome-related coronavirus* / genetics