Combining natural sequence variation with high throughput mutational data to reveal protein interaction sites

PLoS Genet. 2015 Feb 11;11(2):e1004918. doi: 10.1371/journal.pgen.1004918. eCollection 2015 Feb.

Abstract

Many protein interactions are conserved among organisms despite changes in the amino acid sequences that comprise their contact sites, a property that has been used to infer the location of these sites from protein homology. In an inter-species complementation experiment, a sequence present in a homologue is substituted into a protein and tested for its ability to support function. Therefore, substitutions that inhibit function can identify interaction sites that changed over evolution. However, most of the sequence differences within a protein family remain unexplored because of the small-scale nature of these complementation approaches. Here we use existing high throughput mutational data on the in vivo function of the RRM2 domain of the Saccharomyces cerevisiae poly(A)-binding protein, Pab1, to analyze its sites of interaction. Of 197 single amino acid differences in 52 Pab1 homologues, 17 reduce the function of Pab1 when substituted into the yeast protein. The majority of these deleterious mutations interfere with the binding of the RRM2 domain to eIF4G1 and eIF4G2, isoforms of a translation initiation factor. A large-scale mutational analysis of the RRM2 domain in a two-hybrid assay for eIF4G1 binding supports these findings and identifies peripheral residues that make a smaller contribution to eIF4G1 binding. Three single amino acid substitutions in yeast Pab1 corresponding to residues from the human orthologue are deleterious and eliminate binding to the yeast eIF4G isoforms. We create a triple mutant that carries these substitutions and other humanizing substitutions that collectively support a switch in binding specificity of RRM2 from the yeast eIF4G1 to its human orthologue. Finally, we map other deleterious substitutions in Pab1 to inter-domain (RRM2-RRM1) or protein-RNA (RRM2-poly(A)) interaction sites. Thus, the combined approach of large-scale mutational data and evolutionary conservation can be used to characterize interaction sites at single amino acid resolution.
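
The filtering idea in the abstract (collect the single amino acid differences found in aligned homologues, then keep those that score as deleterious in the mutational data) could be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the sequences, fitness scores, and the 0.5 cutoff are invented toy values, and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Substitution = Tuple[int, str, str]  # (1-based position, reference residue, homologue residue)


def homologue_substitutions(reference: str, homologues: Dict[str, str]) -> Set[Substitution]:
    """Collect single-residue differences between a reference and pre-aligned homologues.

    Assumes every homologue is already aligned to the reference (same length,
    '-' marking gaps). Gap positions are skipped.
    """
    subs: Set[Substitution] = set()
    for name, seq in homologues.items():
        if len(seq) != len(reference):
            raise ValueError(f"{name} is not aligned to the reference")
        for pos, (ref_aa, alt_aa) in enumerate(zip(reference, seq), start=1):
            if ref_aa != alt_aa and "-" not in (ref_aa, alt_aa):
                subs.add((pos, ref_aa, alt_aa))
    return subs


def deleterious_natural_substitutions(
    subs: Set[Substitution],
    fitness: Dict[Tuple[int, str], float],
    threshold: float,
) -> List[Tuple[int, str, str, float]]:
    """Keep naturally occurring substitutions whose measured score is below the threshold.

    `fitness` maps (position, substituted residue) to a functional score from a
    large-scale mutational data set; substitutions with no measurement are ignored.
    """
    hits = [
        (pos, ref_aa, alt_aa, fitness[(pos, alt_aa)])
        for pos, ref_aa, alt_aa in subs
        if (pos, alt_aa) in fitness and fitness[(pos, alt_aa)] < threshold
    ]
    return sorted(hits)


# Toy data (invented for illustration; not taken from the study).
REFERENCE = "MKVLA"
HOMOLOGUES = {"h1": "MKILA", "h2": "MRVLA", "h3": "MK-LG"}
FITNESS = {(3, "I"): 0.95, (2, "R"): 0.40, (5, "G"): 0.20, (1, "A"): 0.10}
THRESHOLD = 0.5  # assumed cutoff separating tolerated from deleterious

if __name__ == "__main__":
    natural = homologue_substitutions(REFERENCE, HOMOLOGUES)
    for hit in deleterious_natural_substitutions(natural, FITNESS, THRESHOLD):
        print(hit)
```

In this toy case the substitution at position 1 has a low score but is not reported, because it does not occur in any homologue; only variation that is both natural and deleterious is kept, which is the subset the abstract uses to locate candidate interaction sites.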

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence / genetics*
  • Amino Acid Substitution / genetics
  • Binding Sites
  • DNA Mutational Analysis
  • Eukaryotic Initiation Factor-4G / genetics
  • Eukaryotic Initiation Factor-4G / metabolism
  • Evolution, Molecular*
  • Genetic Variation
  • Humans
  • Mutation / genetics*
  • Poly(A)-Binding Proteins / genetics
  • Poly(A)-Binding Proteins / metabolism*
  • Protein Binding
  • Protein Interaction Maps / genetics*
  • Protein Structure, Tertiary
  • Saccharomyces cerevisiae
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism*
  • Sequence Alignment

Substances

  • Eukaryotic Initiation Factor-4G
  • Poly(A)-Binding Proteins
  • Saccharomyces cerevisiae Proteins
  • TIF4631 protein, S cerevisiae
  • pab1 protein, S cerevisiae