N-Terminomics for the Identification of In Vitro Substrates and Cleavage Site Specificity of the SARS-CoV-2 Main Protease

Proteomics. 2021 Jan;21(2):e2000246. doi: 10.1002/pmic.202000246. Epub 2020 Nov 17.


The genome of coronaviruses, including SARS-CoV-2, encodes two proteases: a papain-like protease (PLpro) and the so-called main protease (Mpro), a chymotrypsin-like cysteine protease also named 3CLpro or non-structural protein 5 (nsp5). Mpro is activated by autoproteolysis and is the main protease responsible for cutting the viral polyprotein into functional units. In addition, Mpro has been described as capable of processing host proteins, including those involved in the host innate immune response. To identify substrates of the main proteases from three coronaviruses, SARS-CoV, SARS-CoV-2, and hCoV-NL63, an LC-MS-based N-terminomics in vitro analysis is performed using recombinantly expressed proteases and lung epithelial and endothelial cell lysates as substrate pools. For SARS-CoV-2 Mpro, 445 cleavage events from more than 300 proteins are identified, while 151 and 331 Mpro-derived cleavage events are identified for SARS-CoV and hCoV-NL63, respectively. These data enable a better understanding of the cleavage site specificity of the viral proteases and will help to identify novel substrates in vivo. All data are available via ProteomeXchange with identifier PXD021406.
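Cleavage site specificity is commonly summarized as a per-position residue frequency profile over sequence windows aligned on the cleaved bond. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `position_frequencies`, the eight-residue window (P4 to P4'), and the toy peptides are all assumptions made for illustration, and the peptides are not taken from PXD021406.

```python
from collections import Counter


def position_frequencies(windows):
    """Return one {residue: fraction} dict per aligned window position."""
    if not windows:
        return []
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    profile = []
    for i in range(width):
        # Count which residues occur at position i across all windows.
        counts = Counter(w[i] for w in windows)
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile


# Hypothetical windows spanning P4..P1 | P1'..P4' (illustration only,
# NOT data from the study).
toy_windows = ["AVLQSGFR", "TVLQAVGA", "ATLQAIAS", "PMLQSADA"]
labels = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]

profile = position_frequencies(toy_windows)
for label, column in zip(labels, profile):
    # Report the most frequent residue at each position.
    residue, fraction = max(column.items(), key=lambda kv: kv[1])
    print(f"{label}: {residue} ({fraction:.2f})")
```

With real data, the windows would be built from the identified neo-N-termini and the preceding residues in the parent protein sequence, and the profile would typically be compared against a background amino acid distribution before drawing conclusions.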

Keywords: Covid19; LC-MS; isobaric labeling; protease substrates; terminomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19 / metabolism
  • COVID-19 / virology*
  • Cells, Cultured
  • Coronavirus 3C Proteases / metabolism*
  • Coronavirus NL63, Human / enzymology*
  • Endothelial Cells / metabolism
  • Endothelial Cells / virology
  • Epithelial Cells / metabolism
  • Epithelial Cells / virology
  • Eukaryotic Initiation Factor-4G / metabolism
  • Host-Pathogen Interactions
  • Humans
  • Lung / metabolism
  • Lung / virology
  • Peptide Fragments / analysis*
  • SARS-CoV-2 / enzymology*
  • Severe acute respiratory syndrome-related coronavirus / enzymology*
  • Substrate Specificity
  • Viral Proteins / metabolism*

Substances

  • EIF4G1 protein, human
  • Eukaryotic Initiation Factor-4G
  • Peptide Fragments
  • Viral Proteins
  • Coronavirus 3C Proteases