SLiMPrints: conservation-based discovery of functional motif fingerprints in intrinsically disordered protein regions

Nucleic Acids Res. 2012 Nov;40(21):10628-41. doi: 10.1093/nar/gks854. Epub 2012 Sep 12.

Abstract

Large portions of higher eukaryotic proteomes are intrinsically disordered, and abundant evidence suggests that these unstructured regions of proteins are rich in regulatory interaction interfaces. One major class of disordered interaction interface comprises the compact and degenerate modules known as short linear motifs (SLiMs). Because SLiMs are difficult to identify and validate experimentally, our understanding of these modules is limited, which motivates the use of computational methods to focus experimental discovery. This article evaluates the use of evolutionary conservation as a discriminatory technique for motif discovery. A statistical framework is introduced to assess the significance of relatively conserved residues, quantifying the likelihood that a residue will have a particular level of conservation given the conservation of the surrounding residues. The framework is expanded to assess the significance of groupings of conserved residues, a metric that forms the basis of SLiMPrints (short linear motif fingerprints), a de novo motif discovery tool. SLiMPrints identifies relatively overconstrained proximal groupings of residues within intrinsically disordered regions, indicative of putatively functional motifs. Finally, the human proteome is analysed to create a set of highly conserved putative motif instances, including a novel site on translation initiation factor eIF2A that may regulate translation through binding of eIF4E.
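
The idea of scoring a residue's conservation relative to its neighbours, and then scoring a grouping of such residues, can be illustrated with a minimal sketch. This is not the SLiMPrints implementation: the abstract does not specify the statistics used, so everything below is an assumption made for illustration. The per-residue conservation scores are taken as given (values in [0, 1]), the background is modelled as a normal distribution fitted to a flanking window, and the grouping score uses Fisher's method over a contiguous window, which treats the per-residue p-values as independent even though overlapping flanks make that only approximate. The function names, the `flank` parameter and the `min_sd` floor are hypothetical.

```python
import math
from statistics import mean, pstdev


def relative_conservation_pvalues(scores, flank=5, min_sd=0.05):
    """One-sided p-value per residue: how unusual is this conservation
    score given the scores of up to `flank` residues on each side?

    Assumption: the flanking scores are modelled as normally distributed.
    `min_sd` is an arbitrary floor that avoids division by zero when the
    flanking residues all have the same score.
    """
    pvals = []
    for i, score in enumerate(scores):
        lo, hi = max(0, i - flank), min(len(scores), i + flank + 1)
        window = scores[lo:i] + scores[i + 1:hi]  # neighbours, excluding residue i
        sd = max(pstdev(window), min_sd)
        z = (score - mean(window)) / sd
        p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
        pvals.append(max(p, 1e-300))  # clamp so log(p) stays finite
    return pvals


def fisher_combine(pvals):
    """Combine p-values with Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom; for even degrees of freedom the survival
    function has the closed form used here.
    """
    half = -sum(math.log(p) for p in pvals)  # statistic / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(len(pvals)))


def best_window(scores, width=3, flank=5):
    """Return (start, combined_p) for the contiguous window of `width`
    residues whose combined relative-conservation p-value is lowest."""
    pvals = relative_conservation_pvalues(scores, flank)
    candidates = [(fisher_combine(pvals[s:s + width]), s)
                  for s in range(len(pvals) - width + 1)]
    combined_p, start = min(candidates)
    return start, combined_p


if __name__ == "__main__":
    # Toy data: three well-conserved residues inside a poorly conserved region.
    toy = [0.2, 0.3, 0.25, 0.2, 0.3, 0.9, 0.95, 0.9, 0.25, 0.2, 0.3, 0.25, 0.2]
    start, p = best_window(toy, width=3)
    print(f"most overconstrained window starts at index {start}, combined p = {p:.2e}")
```

A real analysis would also need an alignment-derived conservation score, a disorder predictor to restrict the search to disordered regions, and a correction for multiple testing across the proteome, none of which this sketch attempts.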

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptor Proteins, Vesicular Transport / chemistry
  • Amino Acid Motifs*
  • Amino Acid Sequence
  • Conserved Sequence
  • Eukaryotic Initiation Factor-2 / chemistry
  • Eukaryotic Initiation Factor-2 / metabolism
  • Eukaryotic Initiation Factor-4E / metabolism
  • F-Box Proteins / chemistry
  • HeLa Cells
  • Humans
  • Molecular Sequence Data
  • Probability
  • Proteome / chemistry
  • Sequence Alignment
  • Sequence Analysis, Protein / methods*

Substances

  • Adaptor Proteins, Vesicular Transport
  • EPN2 protein, human
  • Eukaryotic Initiation Factor-2
  • Eukaryotic Initiation Factor-4E
  • F-Box Proteins
  • Proteome