Modeling mutational effects on biochemical phenotypes using convolutional neural networks: application to SARS-CoV-2

iScience. 2022 Jul 15;25(7):104500. doi: 10.1016/j.isci.2022.104500. Epub 2022 Jun 2.

Abstract

Deep mutational scanning (DMS) experiments on the SARS-CoV-2 spike receptor-binding domain (RBD) and the human angiotensin-converting enzyme 2 (ACE2) zinc-binding peptidase domain (both central players in viral infection and evolution and in antibody evasion) have quantified how mutations impact biochemical phenotypes. We modeled biochemical phenotypes from these massively parallel assays using neural networks trained on protein sequence mutations in the virus and the human host. The neural networks were significantly predictive of binding affinity, protein expression, and antibody escape, learning complex interactions and higher-order features that are difficult to capture with conventional methods from structural biology. Integrating physicochemical properties of amino acids, such as hydrophobicity and long-range non-bonded energy per atom, significantly improved prediction (empirical p < 0.01). The neural network predictions were concordant with all-atom molecular dynamics simulations (multiple runs of 500 ns or 1 μs) of the spike protein-ACE2 interface, with critical implications for the use of deep learning to dissect molecular mechanisms.
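
As a rough illustration of what integrating a physicochemical property into a sequence encoding could look like, the sketch below applies a point mutation to a toy sequence and builds a per-residue feature matrix of 20 one-hot channels plus one hydrophobicity channel. It is not the authors' pipeline. The Kyte-Doolittle hydropathy scale, the function names `apply_mutation` and `encode`, and the toy sequence are all assumptions made for this example; the abstract does not say which hydrophobicity scale or encoding was used.

```python
# Hypothetical encoding sketch; not taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (assumed choice of hydrophobicity scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def apply_mutation(seq, mutation, offset=1):
    """Apply a substitution such as 'N4Y' (wild type, position, mutant).

    `offset` is the residue number of the first character in `seq`.
    """
    wild_type, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - offset
    if seq[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {seq[index]}")
    return seq[:index] + mutant + seq[index + 1:]


def encode(seq):
    """Return one row per residue: 20 one-hot values followed by hydropathy."""
    rows = []
    for residue in seq:
        one_hot = [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]
        rows.append(one_hot + [KYTE_DOOLITTLE[residue]])
    return rows


toy_sequence = "ACDNK"  # toy example, not a real RBD or ACE2 fragment
mutated = apply_mutation(toy_sequence, "N4Y")
features = encode(mutated)  # shape: len(mutated) rows x 21 columns
```

A matrix of this shape (sequence length by channels) is the kind of input a one-dimensional convolutional network can consume; further properties would be added as extra columns in the same way.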

Keywords: Computational intelligence; Computational molecular modeling; Health sciences.