A combinatorial amino acid code for RNA recognition by pentatricopeptide repeat proteins

PLoS Genet. 2012;8(8):e1002910. doi: 10.1371/journal.pgen.1002910. Epub 2012 Aug 16.


The pentatricopeptide repeat (PPR) is a helical repeat motif found in an exceptionally large family of RNA-binding proteins that functions in mitochondrial and chloroplast gene expression. PPR proteins harbor between 2 and 30 repeats and typically bind single-stranded RNA in a sequence-specific fashion. However, the basis for sequence-specific RNA recognition by PPR tracts has been unknown. We used computational methods to infer a code for nucleotide recognition involving two amino acids in each repeat, and we validated this model by recoding a PPR protein to bind novel RNA sequences in vitro. Our results show that PPR tracts bind RNA via a modular recognition mechanism that differs from previously described RNA-protein recognition modes and that underpins a natural library of specific protein/RNA partners of unprecedented size and diversity. These findings provide a significant step toward the prediction of native binding sites of the enormous number of PPR proteins found in nature. Furthermore, the extraordinary evolutionary plasticity of the PPR family suggests that the PPR scaffold will be particularly amenable to redesign for new sequence specificities and functions.
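The modular recognition described above, in which two amino acids in each repeat specify one nucleotide, can be pictured as a per-repeat table lookup. The following is a minimal sketch under stated assumptions, not the authors' method. The abstract does not list the code itself, so the four amino acid pairs in `ILLUSTRATIVE_CODE` are recalled from the wider PPR literature and should be checked against the full paper before use. The function names `predict_target` and `recode` are hypothetical.

```python
# Illustrative sketch only: the abstract does not state the actual code table.
# The pairs below are assumptions recalled from the wider PPR literature and
# must be verified against the full paper before any real use.
ILLUSTRATIVE_CODE = {
    ("T", "D"): "G",
    ("T", "N"): "A",
    ("N", "D"): "U",
    ("N", "S"): "C",
}


def predict_target(repeats, code=ILLUSTRATIVE_CODE, unknown="?"):
    """Predict one nucleotide per repeat from its pair of specifying residues.

    `repeats` is a sequence of (residue_a, residue_b) tuples, one per PPR
    repeat, in order along the tract. Pairs missing from the table yield
    `unknown` rather than a guess.
    """
    return "".join(code.get(pair, unknown) for pair in repeats)


def recode(target_rna, code=ILLUSTRATIVE_CODE):
    """Choose one residue pair per nucleotide of a desired RNA target.

    This inverts the lookup table, mirroring the idea of recoding a PPR
    tract for a new sequence. Raises KeyError for nucleotides the table
    does not cover.
    """
    inverse = {nucleotide: pair for pair, nucleotide in code.items()}
    return [inverse[nucleotide] for nucleotide in target_rna]


if __name__ == "__main__":
    tract = [("T", "D"), ("N", "D"), ("N", "S"), ("T", "N")]
    print(predict_target(tract))
    print(recode("GAUC"))
```

A plain dictionary keeps the sketch transparent. It treats each repeat as independent and deterministic, which is a simplification: it ignores degenerate pairs, repeat context, and binding affinity.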

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • Binding Sites
  • Chloroplasts / genetics
  • Chloroplasts / metabolism*
  • Electrophoretic Mobility Shift Assay
  • Evolution, Molecular
  • Mitochondria / genetics
  • Mitochondria / metabolism*
  • Molecular Sequence Data
  • Plant Proteins / chemistry*
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Plants / genetics
  • Plants / metabolism
  • Protein Binding
  • Protein Structure, Secondary
  • Protein Structure, Tertiary
  • RNA, Plant / chemistry*
  • RNA, Plant / metabolism
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / genetics
  • RNA-Binding Proteins / metabolism
  • Recombinant Proteins / chemistry
  • Recombinant Proteins / genetics
  • Recombinant Proteins / metabolism
  • Repetitive Sequences, Amino Acid / genetics*
  • Sequence Alignment


Substances

  • Plant Proteins
  • RNA, Plant
  • RNA-Binding Proteins
  • Recombinant Proteins

Grant support

This work was supported by National Science Foundation grant MCB-0940979 to AB, Australian Research Council grant DP120102870 to IS and CSB, and the Western Australian Government Centres of Excellence scheme. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.