Accelerating Protein Folding Molecular Dynamics Using Inter-Residue Distances from Machine Learning Servers

J Chem Theory Comput. 2022 Mar 8;18(3):1929-1935. doi: 10.1021/acs.jctc.1c00916. Epub 2022 Feb 8.

Abstract

Recently, predicting the native structures of proteins has become possible using computational molecular physics (CMP), that is, physics-based force fields sampled with proper statistics, but only for small proteins. Algorithms with better scaling are needed. We describe ML x MELD x MD, a molecular dynamics (MD) method that inputs residue contacts derived from machine learning (ML) servers into MELD, a Bayesian accelerator that preserves detailed-balance statistics. Contacts are derived from trRosetta-predicted distance histograms (distograms) and are integrated into MELD's atomistic MD as spatial restraints through parametrized potential functions. In the CASP14 blind prediction event, ML x MELD x MD predicted 13 native structures to better than 4.5 Å error, including 10 proteins 115-250 amino acids long. Also, the scaling of simulation time with protein length N is much better than for unguided MD: $t_{\mathrm{sim}} \sim e^{0.023N}$ for ML x MELD x MD vs $t_{\mathrm{sim}} \sim e^{0.168N}$ for MD alone. This shows how machine learning information can be leveraged to advance physics-based modeling of proteins.
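To make the two scaling exponents concrete, the following minimal sketch works out what they imply, assuming only the exponential form quoted in the abstract. The function names `relative_cost` and `doubling_length` are illustrative and do not come from the paper or from MELD. Because the abstract gives no prefactors, the sketch compares growth within each method and says nothing about absolute simulation times.

```python
import math

# Exponents quoted in the abstract for t_sim ~ exp(k * N),
# where N is the protein length in residues.
K_GUIDED = 0.023    # ML x MELD x MD
K_UNGUIDED = 0.168  # MD alone


def relative_cost(n_from: int, n_to: int, k: float) -> float:
    """Factor by which t_sim grows from n_from to n_to residues.

    Assumes t_sim ~ exp(k * N); the unknown prefactor cancels in the ratio.
    """
    return math.exp(k * (n_to - n_from))


def doubling_length(k: float) -> float:
    """Number of added residues that doubles t_sim under t_sim ~ exp(k * N)."""
    return math.log(2) / k


if __name__ == "__main__":
    for label, k in (("ML x MELD x MD", K_GUIDED), ("MD alone", K_UNGUIDED)):
        print(
            f"{label}: t_sim doubles every {doubling_length(k):.1f} residues; "
            f"100 -> 200 residues costs {relative_cost(100, 200, k):.3g}x more"
        )
```

Under these assumptions, simulation time doubles roughly every 30 added residues with ML x MELD x MD, compared with roughly every 4 residues for unguided MD.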

MeSH terms

  • Bayes Theorem
  • Computational Biology / methods
  • Machine Learning
  • Molecular Dynamics Simulation*
  • Protein Conformation
  • Protein Folding*