Discovery of phosphorylation motif mixtures in phosphoproteomics data

Bioinformatics. 2009 Jan 1;25(1):14-21. doi: 10.1093/bioinformatics/btn569. Epub 2008 Nov 7.

Abstract

Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence.

Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments. Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases.
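
Illustration: the abstract does not give the algorithm's details, so the following is only a toy sketch of how a minimum description length criterion could select a mixture of motifs that separates foreground from background peptides. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the restriction to single-residue motifs, the per-site model cost, the first-match assignment of peptides to motifs, and the greedy search.

```python
import math

ALPHABET_SIZE = 20  # standard amino acids


def label_bits(n_fg, n_bg):
    """Bits needed to encode foreground/background labels in one bin,
    using an ideal code built from the bin's empirical label frequencies."""
    total = n_fg + n_bg
    bits = 0.0
    for count in (n_fg, n_bg):
        if count:
            bits -= count * math.log2(count / total)
    return bits


def matches(motif, peptide):
    """A motif is a dict {position: residue}; every fixed position must agree."""
    return all(peptide[pos] == aa for pos, aa in motif.items())


def description_length(motifs, foreground, background):
    """Model cost plus label cost. Each peptide is assigned to the first
    motif it matches, or to a shared 'no motif' bin (an assumed scheme)."""
    width = len(foreground[0])
    # Assumed model cost: name one position and one residue per fixed site.
    site_bits = math.log2(width) + math.log2(ALPHABET_SIZE)
    model_bits = sum(len(m) for m in motifs) * site_bits
    counts = [[0, 0] for _ in range(len(motifs) + 1)]  # last bin = no motif
    for label, peptides in enumerate((foreground, background)):
        for pep in peptides:
            bin_index = next(
                (i for i, m in enumerate(motifs) if matches(m, pep)),
                len(motifs),
            )
            counts[bin_index][label] += 1
    return model_bits + sum(label_bits(fg, bg) for fg, bg in counts)


def greedy_motif_mixture(foreground, background):
    """Repeatedly add the single-residue motif that shortens the total
    description the most; stop when no candidate shortens it."""
    candidates = sorted(
        {(pos, aa) for pep in foreground for pos, aa in enumerate(pep)}
    )
    motifs = []
    best = description_length(motifs, foreground, background)
    while True:
        choice = None
        for pos, aa in candidates:
            trial = description_length(
                motifs + [{pos: aa}], foreground, background
            )
            if trial < best:
                best, choice = trial, {pos: aa}
        if choice is None:
            return motifs, best
        motifs.append(choice)


# Synthetic 7-residue peptides centered on serine (index 3). Half of the
# foreground carries P at index 4, the other half carries R at index 0.
flanks = "AGLK"
foreground = [f"{a}{b}{a}SP{b}{a}" for a in flanks for b in flanks]
foreground += [f"R{b}{a}S{b}{a}{b}" for a in flanks for b in flanks]
background = [f"{a}{b}{a}S{b}{a}{b}" for a in flanks for b in flanks] * 2

found, bits = greedy_motif_mixture(foreground, background)
print(found, round(bits, 1))
```

In this toy setting a motif is accepted only when the bits it saves in encoding the foreground/background labels exceed the bits needed to describe the motif itself, which is what keeps the mixture from growing without bound.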

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Motifs
  • Amino Acid Sequence
  • Animals
  • Cell Line, Tumor
  • Databases, Protein*
  • Humans
  • Mice
  • Molecular Sequence Data
  • Phosphoproteins / chemistry*
  • Phosphorylation
  • Proteomics*
  • ROC Curve
  • Receptor, ErbB-2 / chemistry
  • Reproducibility of Results

Substances

  • Phosphoproteins
  • Receptor, ErbB-2