An all-atom protein generative model

Proc Natl Acad Sci U S A. 2024 Jul 2;121(27):e2311500121. doi: 10.1073/pnas.2311500121. Epub 2024 Jun 25.

Abstract

Proteins mediate their functions through chemical interactions; modeling these interactions, which typically occur through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins, as encoded in their structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which represents all sidechain states at once as a "superposition" state; the superpositions defining a protein are collapsed into individual residue types and conformations during sample generation. When combined with sequence design methods, our model is able to codesign all-atom protein structure and sequence. Generated proteins score well on the typical quality, diversity, and novelty metrics, and their sidechains reproduce the chemical features and behavior of natural proteins. Finally, we explore the potential of our model to conduct all-atom protein design and to scaffold functional motifs in a backbone- and rotamer-free way.
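The "superposition" idea can be illustrated with a toy sketch. This is not the authors' implementation: the atom-slot layout, the three-residue vocabulary, and the names `ATOM_SLOTS`, `RESIDUE_ATOMS`, `slot_mask`, and `collapse` are all hypothetical, chosen only to show how a per-residue array that holds coordinates for every candidate atom might be reduced to the atoms of one chosen residue type.

```python
import numpy as np

# Hypothetical toy layout: every residue carries a slot for each atom name.
ATOM_SLOTS = ["N", "CA", "C", "O", "CB", "OG", "CG"]

# Hypothetical toy vocabulary: which slots each residue type actually uses.
RESIDUE_ATOMS = {
    "GLY": ["N", "CA", "C", "O"],
    "ALA": ["N", "CA", "C", "O", "CB"],
    "SER": ["N", "CA", "C", "O", "CB", "OG"],
}


def slot_mask(residue_type):
    """Boolean mask over ATOM_SLOTS for one residue type."""
    allowed = set(RESIDUE_ATOMS[residue_type])
    return np.array([name in allowed for name in ATOM_SLOTS])


def collapse(superposition, residue_types):
    """Keep only the atom slots that belong to each chosen residue type.

    superposition: array of shape (n_residues, n_slots, 3) holding
        coordinates for every candidate atom at every position.
    residue_types: one residue name per position.
    Returns (coords, mask); unused slots in coords are set to NaN.
    """
    mask = np.stack([slot_mask(r) for r in residue_types])
    coords = np.where(mask[..., None], superposition, np.nan)
    return coords, mask


# Random coordinates stand in for a partially denoised sample.
rng = np.random.default_rng(0)
superposition = rng.normal(size=(3, len(ATOM_SLOTS), 3))
coords, mask = collapse(superposition, ["GLY", "ALA", "SER"])
```

In this sketch the residue types are supplied by hand; in the abstract's setting they would come from a sequence design method applied during sample generation.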

Keywords: full-atom model; generative modeling; protein design; protein structure; sidechain generation.

MeSH terms

  • Amino Acid Sequence
  • Models, Molecular*
  • Protein Conformation*
  • Proteins* / chemistry

Substances

  • Proteins