Large-scale discovery of protein interactions at residue resolution using co-evolution calculated from genomic sequences

Nat Commun. 2021 Mar 2;12(1):1396. doi: 10.1038/s41467-021-21636-z.

Abstract

Increasing numbers of protein interactions have been identified in high-throughput experiments, but only a small proportion have solved structures. Recently, sequence coevolution-based approaches have led to a breakthrough in predicting monomer protein structures and protein interaction interfaces. Here, we address the challenges of large-scale interaction prediction at residue resolution with a fast alignment concatenation method and a probabilistic score for the interaction of residues. Importantly, this method (EVcomplex2) can assess the likelihood of a protein interaction, as we show by applying it to large-scale experimental datasets in which the pairwise interactions are unknown. We predict 504 interactions de novo in the E. coli membrane proteome, including 243 that are newly discovered. Although EVcomplex2 does not require available structures, coevolving residue pairs can be used to produce structural models of protein interactions, as done here for membrane complexes including the flagellar hook-filament junction and the Tol/Pal complex.
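
The abstract does not describe how alignment concatenation is carried out. As a rough illustration of the general idea only, and not the EVcomplex2 procedure, the sketch below joins two aligned protein families row by row wherever both have a sequence from the same genome. The function name `concatenate_alignments`, the genome identifiers, and the toy sequences are all hypothetical, and the sketch assumes exactly one sequence per genome in each alignment, so the handling of paralogs is left out.

```python
from typing import Dict


def concatenate_alignments(aln_a: Dict[str, str], aln_b: Dict[str, str]) -> Dict[str, str]:
    """Join two alignments on shared genome identifiers.

    Each input maps a genome identifier to one aligned sequence
    (assumption: a single sequence per genome, no paralogs).
    Genomes present in only one alignment are dropped.
    """
    shared_genomes = sorted(set(aln_a) & set(aln_b))
    return {genome: aln_a[genome] + aln_b[genome] for genome in shared_genomes}


# Toy, made-up alignments for two hypothetical protein families.
aln_a = {"ecoli": "MKT-A", "salmonella": "MRT-A", "vibrio": "MKSGA"}
aln_b = {"ecoli": "LLQ", "salmonella": "LIQ", "bacillus": "VLQ"}

paired = concatenate_alignments(aln_a, aln_b)
for genome, sequence in paired.items():
    print(genome, sequence)
```

Each row of the joined alignment then spans both proteins, which is what lets a coevolution analysis consider residue pairs that cross the protein boundary.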

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acids / genetics*
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics*
  • Bacterial Proteins / metabolism*
  • Base Sequence
  • Escherichia coli / genetics
  • Eukaryotic Cells / metabolism
  • Evolution, Molecular*
  • Genome, Bacterial*
  • Membrane Proteins / metabolism
  • Molecular Docking Simulation
  • Protein Binding
  • Protein Interaction Mapping*
  • Proteome / metabolism

Substances

  • Amino Acids
  • Bacterial Proteins
  • Membrane Proteins
  • Proteome