Using machine learning to study protein-protein interactions: From the uromodulin polymer to egg zona pellucida filaments

Mol Reprod Dev. 2021 Oct;88(10):686-693. doi: 10.1002/mrd.23538. Epub 2021 Sep 29.

Abstract

Neural network-based models for protein structure prediction have recently reached near-experimental accuracy and are fast becoming a powerful tool in the arsenal of biologists. As suggested by initial studies using RoseTTAFold or the ColabFold implementation of AlphaFold2, a particularly interesting future development will be the optimization of these computational methods to also routinely yield high-confidence predictions of protein-protein interactions. Here I use AlphaFold2 and ColabFold to investigate the activation and polymerization of uromodulin (UMOD)/Tamm-Horsfall protein, a zona pellucida (ZP) module-containing protein whose precursor and filamentous structures have been previously determined experimentally by X-ray crystallography and cryo-EM, respectively. Despite having no knowledge of the UMOD polymer structure (coordinates for which were used neither for model training nor as a template), AlphaFold2/ColabFold are able to recapitulate a crucial conformational change underlying UMOD polymerization, as well as the general organization of protein subunits within the resulting filament. This surprising result is achieved by simply deleting from the input sequence a stretch of residues that corresponds to a polymerization-inhibiting C-terminal propeptide. By mimicking in silico the activating effect of propeptide dissociation triggered by site-specific proteolysis of the protein precursor, this example has implications for the assembly of egg coat proteins and the many other molecules that also contain a ZP module. Most importantly, it shows the potential of exploiting machine learning not only to accurately predict the structures of individual proteins or complexes, but also to carry out computational experiments replicating specific molecular events.
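
The input manipulation described above (deleting a stretch of residues from the sequence before prediction) can be illustrated with a minimal Python sketch. It is not the author's actual pipeline: the toy sequence, the residue range, the number of copies, and the helper names `remove_propeptide` and `homo_oligomer_query` are all hypothetical. Joining chain copies with `:` is assumed here as the convention for a multi-chain ColabFold query and should be checked against the ColabFold documentation before use.

```python
def remove_propeptide(sequence: str, start: int, end: int) -> str:
    """Delete residues start..end (1-based, inclusive) from a protein sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("residue range must lie within the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[: start - 1] + sequence[end:]


def homo_oligomer_query(sequence: str, copies: int) -> str:
    """Join identical chains with ':' (assumed multi-chain separator)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return ":".join([sequence] * copies)


if __name__ == "__main__":
    # Hypothetical 20-residue toy sequence; NOT the UMOD sequence.
    toy_precursor = "MKTAYIAKQRGSPLVDEWNC"
    # Hypothetical propeptide spanning residues 15-20 of the toy sequence.
    activated = remove_propeptide(toy_precursor, start=15, end=20)
    # Three copies chosen arbitrarily to request a homo-oligomer prediction.
    query = homo_oligomer_query(activated, copies=3)
    print(">toy_activated_trimer")
    print(query)
```

Running the same prediction once with the full-length sequence and once with the truncated one gives the kind of in silico comparison the abstract describes; the real residue boundaries would have to be taken from the paper itself.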

Keywords: artificial intelligence; protein polymerization; protein-protein interactions; uromodulin; zona pellucida.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Machine Learning
  • Polymers* / analysis
  • Polymers* / metabolism
  • Uromodulin / analysis
  • Uromodulin / chemistry
  • Uromodulin / metabolism
  • Zona Pellucida* / metabolism

Substances

  • Polymers
  • Uromodulin