SHARK-capture identifies functional motifs in intrinsically disordered protein regions

Protein Sci. 2025 Apr;34(4):e70091. doi: 10.1002/pro.70091.

Abstract

Increasing insights into how sequence motifs in intrinsically disordered regions (IDRs) provide functions underscore the need for systematic motif detection. In contrast to structured regions, where motifs can be readily identified from sequence alignments, the rapid evolution of IDRs limits the ability of alignment-based tools to reliably detect motifs within them. Here, we developed SHARK-capture, an alignment-free motif detection tool designed for difficult-to-align regions. SHARK-capture innovates on word-based methods by flexibly incorporating amino acid physicochemistry to assess motif similarity without requiring rigid definitions of equivalency groups. SHARK-capture offers consistently strong performance in a systematic benchmark, with superior residue-level performance. SHARK-capture identified known functional motifs across orthologs of the microtubule-associated zinc finger protein BuGZ. We also identified a short motif in the IDR of S. cerevisiae RNA helicase Ded1p, which we experimentally verified to be capable of promoting ATPase activity. Our improved performance allows us to systematically calculate 10,889 motifs for 2,695 yeast IDRs and provide them as a resource. SHARK-capture offers the most precise tool yet for the systematic identification of conserved regions in IDRs and is freely available as a Python package (https://pypi.org/project/bio-shark/) and on https://git.mpi-cbg.de/tothpetroczylab/shark.
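
To make the idea of a word-based, alignment-free comparison with graded physicochemical similarity more concrete, the following is a minimal toy sketch. It is not the SHARK-capture algorithm and does not use the bio-shark API. The similarity function (Kyte-Doolittle hydropathy plus a crude formal charge), the equal weighting of the two properties, the function names, and the example sequences are all assumptions made purely for illustration. It scores every k-mer of a reference sequence by its best ungapped match in each other sequence, so near-equivalent residues (e.g. I and L) count as similar without being placed in fixed equivalency groups.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale (range -4.5 to 4.5).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Crude formal charge at neutral pH (assumption: His treated as neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def residue_similarity(a: str, b: str) -> float:
    """Graded similarity in [0, 1]; identical residues score 1.0."""
    hyd = 1.0 - abs(KD[a] - KD[b]) / 9.0          # 9.0 = full hydropathy range
    chg = 1.0 - abs(CHARGE.get(a, 0) - CHARGE.get(b, 0)) / 2.0
    return (hyd + chg) / 2.0                      # illustrative equal weighting


def kmer_similarity(x: str, y: str) -> float:
    """Mean per-position similarity of two equal-length words."""
    return mean(residue_similarity(a, b) for a, b in zip(x, y))


def kmers(seq: str, k: int) -> list[str]:
    """All overlapping words of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def best_match(kmer: str, seq: str) -> float:
    """Best similarity of `kmer` to any same-length word in `seq`."""
    return max(kmer_similarity(kmer, other) for other in kmers(seq, len(kmer)))


def rank_kmers(reference: str, others: list[str], k: int):
    """Rank reference k-mers by mean best-match score across `others`.

    Returns (score, start_index, kmer) tuples, highest score first.
    """
    scored = [
        (mean(best_match(km, s) for s in others), i, km)
        for i, km in enumerate(kmers(reference, k))
    ]
    return sorted(scored, key=lambda t: -t[0])


if __name__ == "__main__":
    # Hypothetical toy sequences sharing the word "RGRGF" in different contexts.
    ranked = rank_kmers("AAARGRGFAAA", ["TTTRGRGFTTT"], k=5)
    for score, start, word in ranked[:3]:
        print(f"{word} at {start}: {score:.3f}")
```

Because no alignment is computed, the shared word is recovered regardless of where it sits in each sequence; a practical tool would additionally need principled scoring, handling of variable motif lengths, and statistical calibration, none of which are attempted here.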

Keywords: IDRs; alignment‐free; motif detection; sequence‐to‐function.

MeSH terms

  • Amino Acid Motifs
  • Computational Biology* / methods
  • Intrinsically Disordered Proteins* / chemistry
  • Intrinsically Disordered Proteins* / genetics
  • Intrinsically Disordered Proteins* / metabolism
  • Saccharomyces cerevisiae / chemistry
  • Saccharomyces cerevisiae Proteins* / chemistry
  • Saccharomyces cerevisiae Proteins* / genetics
  • Saccharomyces cerevisiae Proteins* / metabolism
  • Sequence Alignment
  • Software*

Substances

  • Intrinsically Disordered Proteins
  • Saccharomyces cerevisiae Proteins