Machine Learning Generation of Dynamic Protein Conformational Ensembles

Molecules. 2023 May 12;28(10):4047. doi: 10.3390/molecules28104047.

Abstract

Machine learning has achieved remarkable success across a broad range of scientific and engineering disciplines, particularly in predicting native protein structures from sequence information alone. However, biomolecules are inherently dynamic, and there is a pressing need for accurate predictions of dynamic structural ensembles across multiple functional levels. These problems range from the relatively well-defined task of predicting conformational dynamics around the native state of a protein, which traditional molecular dynamics (MD) simulations are particularly adept at handling, to generating large-scale conformational transitions that connect distinct functional states of structured proteins, or the numerous marginally stable states within the dynamic ensembles of intrinsically disordered proteins. Machine learning has been increasingly applied to learn low-dimensional representations of protein conformational spaces, which can then be used to drive additional MD sampling or to directly generate novel conformations. These methods promise to greatly reduce the computational cost of generating dynamic protein ensembles compared to traditional MD simulations. In this review, we examine recent progress in machine learning approaches to generative modeling of dynamic protein ensembles and emphasize the crucial importance of integrating advances in machine learning, structural data, and physical principles to achieve these ambitious goals.

Keywords: Boltzmann generator; autoencoder; collective variable; dimension reduction; enhanced sampling; generative adversarial network; latent space; neural network; physics-informed machine learning; transfer learning.

Publication types

  • Review

MeSH terms

  • Intrinsically Disordered Proteins* / chemistry
  • Machine Learning
  • Molecular Dynamics Simulation
  • Protein Conformation

Substances

  • Intrinsically Disordered Proteins