End-to-End Differentiable Learning of Protein Structure

Cell Syst. 2019 Apr 24;8(4):292-301.e3. doi: 10.1016/j.cels.2019.03.006. Epub 2019 Apr 17.


Predicting protein structure from sequence is a central challenge of biochemistry. Co-evolution methods show promise, but an explicit sequence-to-structure map remains elusive. Advances in deep learning that replace complex, human-designed pipelines with differentiable models optimized end to end suggest the potential benefits of similarly reformulating structure prediction. Here, we introduce an end-to-end differentiable model for protein structure learning. The model couples local and global protein structure via geometric units that optimize global geometry without violating local covalent chemistry. We test our model using two challenging tasks: predicting novel folds without co-evolutionary data and predicting known folds without structural templates. In the first task, the model achieves state-of-the-art accuracy, and in the second, it comes within 1-2 Å of competing methods. Those methods, which use co-evolution and experimental templates, have been refined over many years, so the differentiable approach likely has substantial room for further improvement, with applications ranging from drug discovery to protein design.
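
The abstract does not spell out how the geometric units work, so the following is only a minimal sketch of one standard way to get the property it describes: build the chain from torsion angles while holding bond lengths and bond angles fixed, so that local geometry holds by construction and only the global shape varies. It is not the authors' implementation. The function names (`place_atom`, `build_chain`, `angle`, `dihedral`), the single-atom-type chain, and the values 1.5 and 110 degrees are all illustrative assumptions.

```python
import math

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(dot(u, u))
    return (u[0] / n, u[1] / n, u[2] / n)

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place atom d after a, b, c so that |d - c| = bond_length, the angle
    b-c-d equals bond_angle, and the dihedral a-b-c-d equals torsion."""
    bc = unit(sub(c, b))              # axis along the previous bond
    n = unit(cross(sub(b, a), bc))    # normal to the a-b-c plane
    m = cross(n, bc)                  # completes the local frame
    x = -bond_length * math.cos(bond_angle)
    y = bond_length * math.sin(bond_angle) * math.cos(torsion)
    z = bond_length * math.sin(bond_angle) * math.sin(torsion)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))

def build_chain(torsions, bond_length=1.5, bond_angle=math.radians(110.0)):
    """Grow a chain one atom per torsion. The default bond length and bond
    angle are illustrative placeholders, not values from the paper."""
    # Seed three atoms that already satisfy the fixed local geometry.
    coords = [
        (bond_length * math.cos(bond_angle),
         bond_length * math.sin(bond_angle), 0.0),
        (0.0, 0.0, 0.0),
        (bond_length, 0.0, 0.0),
    ]
    for t in torsions:
        a, b, c = coords[-3:]
        coords.append(place_atom(a, b, c, bond_length, bond_angle, t))
    return coords

def angle(a, b, c):
    """Angle at b between the bonds b-a and b-c, in radians."""
    u, v = unit(sub(a, b)), unit(sub(c, b))
    return math.acos(max(-1.0, min(1.0, dot(u, v))))

def dihedral(a, b, c, d):
    """Signed dihedral a-b-c-d in radians, in the range (-pi, pi]."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.atan2(y, x)

# Whatever torsions are supplied, every bond length and bond angle in the
# resulting chain keeps its fixed value; only the global shape changes.
chain = build_chain([0.3, -1.2, 2.5, 1.0])
```

Every step here is a composition of smooth operations (sines, cosines, cross products, normalization), which is what would let an automatic-differentiation framework pass gradients from a loss on the final coordinates back to the torsion angles. This pure-Python version only illustrates the geometry and computes no gradients.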

Keywords: biophysics; co-evolution; deep learning; geometric deep learning; homology modeling; machine learning; protein design; protein folding; protein structure prediction; structural biology.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Evolution, Molecular
  • Machine Learning*
  • Protein Folding
  • Sequence Analysis, Protein / methods*
  • Sequence Analysis, Protein / standards
  • Software*