MotSASi: Functional short linear motifs (SLiMs) prediction based on genomic single nucleotide variants and structural data

Biochimie. 2022 Jun:197:59-73. doi: 10.1016/j.biochi.2022.02.002. Epub 2022 Feb 5.

Abstract

Short linear motifs (SLiMs) are key to cell physiology, mediating reversible protein-protein interactions. Precise identification of SLiMs remains a challenge: the main drawback of most bioinformatic prediction tools is their low specificity (a high number of false positives). An important, usually overlooked, aspect is the relation between SLiM mutations and disease. The presence of variants at each residue position can be used to assess the relevance of the corresponding residue(s) for protein function and its (in)tolerance to change. In the present work, we combined sequence variant information with structural analysis of the energetic impact of single amino acid substitutions (SAS) in SLiM-receptor complex structures, and showed that this improves prediction of true functional SLiMs. Our strategy is based on building a SAS tolerance matrix that shows, for each position, whether each of the 19 possible SAS is tolerated or not. Herein we present the MotSASi strategy and analyze in detail 3 SLiMs involved in intracellular protein trafficking: the phospho-independent tyrosine-based motif (NPx[Y/F]), the type 1 PDZ-binding motif ([S/T]x[V/I/L]COOH) and the tryptophan-acidic motif ([L/M]xW[D/E]). Our results show that inclusion of variant and structure information improves both prediction of true SLiMs and rejection of false positives, while also allowing better classification of variants inside SLiMs, a result with a direct impact on clinical genomics.
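To make the two ideas in the abstract concrete, the following Python sketch (a) translates the three motif patterns into regular expressions and scans a sequence for candidate matches, and (b) builds a toy SAS tolerance matrix from per-substitution energy changes. It is an illustration only, not the MotSASi implementation. The function names, the `ddg` input format, and the `threshold` value of 2.0 are all hypothetical. The sketch uses energy values alone, whereas the published strategy also draws on variant data, which is omitted here.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Regex translations of the motif patterns quoted in the abstract.
# "x" becomes "." (any residue); "COOH" becomes "$" (C-terminal end).
MOTIFS = {
    "NPx[Y/F]": re.compile(r"NP.[YF]"),
    "[S/T]x[V/I/L]COOH": re.compile(r"[ST].[VIL]$"),
    "[L/M]xW[D/E]": re.compile(r"[LM].W[DE]"),
}


def find_candidates(sequence, pattern):
    """Return (start, matched_text) for every match, overlapping ones included."""
    # A lookahead wrapper lets the scan report matches that overlap each other.
    lookahead = re.compile("(?=(" + pattern.pattern + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]


def build_tolerance_matrix(wild_type, ddg, threshold=2.0):
    """Build a toy tolerance matrix for a motif instance.

    wild_type: motif residues as a string, e.g. "NPAY".
    ddg: hypothetical dict mapping (position, substituted_residue) to an
         energy change, where larger values mean more destabilizing.
    threshold: hypothetical cutoff; a substitution below it counts as tolerated.

    Returns {position: {residue: True | False | None}}, where None marks a
    substitution with no energy value available.
    """
    matrix = {}
    for pos, wt in enumerate(wild_type):
        row = {}
        for aa in AMINO_ACIDS:
            if aa == wt:
                continue  # only the 19 substitutions, not the wild-type residue
            value = ddg.get((pos, aa))
            row[aa] = None if value is None else value < threshold
        matrix[pos] = row
    return matrix
```

A candidate found by `find_candidates` could then be checked against observed variants by looking up each substitution in the matrix; how such evidence is weighed against variant annotations is left open in this sketch.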

Keywords: ClinVar; FoldX; GnomAD; Short linear motifs (SLiMs); Single amino acid substitution (SAS); Single nucleotide variants (SNV).

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Computational Biology* / methods
  • Genomics*
  • Nucleotides

Substances

  • Nucleotides