New analysis pipeline for high-throughput domain-peptide affinity experiments improves SH2 interaction data

J Biol Chem. 2020 Aug 7;295(32):11346-11363. doi: 10.1074/jbc.RA120.012503. Epub 2020 Jun 15.


Protein domain interactions with short linear peptides, such as those of the Src homology 2 (SH2) domain with phosphotyrosine-containing peptide motifs (pTyr), are ubiquitous and important to many biochemical processes of the cell. The desire to map and quantify these interactions has resulted in the development of high-throughput (HTP) quantitative measurement techniques, such as microarray or fluorescence polarization assays. For example, in the last 15 years, experiments have progressed from measuring single interactions to covering 500,000 of the 5.5 million possible SH2-pTyr interactions in the human proteome. However, high variability in affinity measurements and disagreements about positive interactions between published data sets led us here to reevaluate the analysis methods and raw data of published SH2-pTyr HTP experiments. We identified several opportunities for improving the identification of positive and negative interactions and the accuracy of affinity measurements. We implemented model-fitting techniques that are more statistically appropriate for the nonlinear SH2-pTyr interaction data. We also developed a method to account for protein concentration errors due to impurities and degradation or protein inactivity and aggregation. Our revised analysis increases the reported affinity accuracy, reduces the false-negative rate, and increases the amount of useful data by adding reliable true-negative results. We demonstrate improvement in classification of binding versus nonbinding when using machine-learning techniques, suggesting improved coherence in the reanalyzed data sets. We present revised SH2-pTyr affinity results and propose a new analysis pipeline for future HTP measurements of domain-peptide interactions.
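As a minimal sketch of what fitting nonlinear binding data can involve, the example below fits a one-site saturation model, signal = fmax * c / (kd + c), by least squares. It is an illustration under stated assumptions, not the pipeline described in this paper: the function name `fit_one_site`, the grid range, and the synthetic noiseless data are all hypothetical. The second fit shows why errors in protein concentration matter: if only a fraction of the nominal protein is active, fitting against nominal concentrations inflates the apparent Kd by the inverse of that fraction.

```python
def fit_one_site(conc, signal, kd_min=1e-3, kd_max=1e3, steps=601):
    """Least-squares fit of signal = fmax * c / (kd + c).

    For a fixed kd the model is linear in fmax, so fmax has a closed-form
    solution; kd is scanned on a log-spaced grid. The grid range is an
    assumption and is expressed in the same units as conc.
    Returns (kd, fmax, sum_of_squared_errors).
    """
    best = None
    for i in range(steps):
        kd = kd_min * (kd_max / kd_min) ** (i / (steps - 1))
        x = [c / (kd + c) for c in conc]  # fractional saturation at this kd
        fmax = sum(xi * yi for xi, yi in zip(x, signal)) / sum(xi * xi for xi in x)
        sse = sum((yi - fmax * xi) ** 2 for xi, yi in zip(x, signal))
        if best is None or sse < best[2]:
            best = (kd, fmax, sse)
    return best


# Synthetic, noiseless data (hypothetical values, arbitrary units).
true_kd, true_fmax = 0.5, 100.0
active_conc = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
signal = [true_fmax * c / (true_kd + c) for c in active_conc]

# Fit against the true active concentrations.
kd, fmax, sse = fit_one_site(active_conc, signal)

# Assume only half of the nominal protein is active: the experimenter
# believes the concentrations are twice what actually binds.
active_fraction = 0.5
nominal_conc = [c / active_fraction for c in active_conc]
kd_apparent, fmax_apparent, _ = fit_one_site(nominal_conc, signal)

print(f"fit on active concentrations:  kd={kd:.3f}, fmax={fmax:.1f}")
print(f"fit on nominal concentrations: kd={kd_apparent:.3f}, fmax={fmax_apparent:.1f}")
```

Scanning Kd on a log grid keeps the sketch dependency-free; a production analysis would more likely use an iterative nonlinear optimizer with replicate measurements and explicit uncertainty estimates.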

Keywords: Src homology 2 domain (SH2 domain); affinity; best practices; cell signaling; epidermal growth factor receptor (EGFR); high-throughput; kinetics; mathematical modeling; peptide interaction; phosphotyrosine; phosphotyrosine signaling; protein-protein interaction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • High-Throughput Screening Assays / methods*
  • Humans
  • Peptides / chemistry*
  • Protein Binding
  • Reproducibility of Results
  • src Homology Domains*


Substances

  • Peptides

Associated data

  • figshare/10.6084/m9.figshare.11482686.v1
  • figshare/10.6084/m9.figshare.12326609.v1