Peptomics, identification of novel cationic Arabidopsis peptides with conserved sequence motifs

In Silico Biol. 2002;2(4):441-51.


Few plant peptides involved in intercellular communication have been experimentally isolated. Sequence analysis of the Arabidopsis thaliana genome has revealed numerous transmembrane receptors predicted to bind proteinaceous ligands, emphasizing the importance of identifying peptides with signaling function. Annotation of the Arabidopsis genome sequence has made it possible to identify peptide-encoding genes. However, such annotation-based identification is impeded because small genes are poorly predicted by gene-prediction algorithms, prompting the alternative approaches described here. We initially performed a systematic analysis of short polypeptides encoded by annotated genes on two Arabidopsis chromosomes, using SignalP to identify potentially secreted peptides. Subsequent homology searches with selected, putatively secreted peptides led to the identification of a large putative Arabidopsis family of 34 genes. The predicted peptides are characterized by a conserved C-terminal sequence motif and additional primary structure conservation in a core region. The majority of these genes had not previously been annotated. A subset of the predicted peptides shows high overall sequence similarity to Rapid Alkalinization Factor (RALF), a peptide isolated from tobacco. We therefore refer to this peptide family as RALFL, for RALF-Like. RT-PCR analysis confirmed that several of the Arabidopsis genes are expressed and that their expression patterns vary. The identification of a large gene family in the genome of the model organism Arabidopsis thaliana demonstrates that a combination of systematic analysis and homology searching can contribute to peptide discovery.
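The first screening step described above (selecting short polypeptides from annotated genes, with a focus on cationic peptides) can be illustrated with a minimal sketch. This is not the authors' pipeline: the length cutoff `MAX_LENGTH`, the crude charge heuristic in `net_charge`, and the toy sequences are all assumptions made for illustration, and the SignalP and homology-search steps are not reproduced here.

```python
from typing import Dict, List

# Assumed cutoff for a "short" polypeptide; the abstract does not state one.
MAX_LENGTH = 150


def net_charge(seq: str) -> int:
    """Crude net charge: number of K and R residues minus number of D and E.

    This ignores histidine, termini, and pH; it is only a rough proxy
    for whether a peptide is cationic.
    """
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def select_short_cationic(
    proteins: Dict[str, str], max_length: int = MAX_LENGTH
) -> List[str]:
    """Return names of sequences that are short and have a positive net charge."""
    return [
        name
        for name, seq in proteins.items()
        if len(seq) <= max_length and net_charge(seq) > 0
    ]


# Hypothetical toy sequences, not real Arabidopsis peptides.
toy = {
    "pep_a": "MKRLLKAVRG",  # short and cationic
    "pep_b": "MDEELDAG",    # short but anionic
    "pep_c": "MKR" * 100,   # cationic but 300 residues long
}

candidates = select_short_cationic(toy)
print(candidates)
```

In a fuller workflow, the surviving candidates would then be passed to a signal-peptide predictor such as SignalP and used as queries in homology searches, as the abstract describes.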

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Motifs
  • Amino Acid Sequence
  • Arabidopsis / metabolism*
  • Cations
  • Conserved Sequence
  • Expressed Sequence Tags
  • Ligands
  • Models, Genetic
  • Molecular Sequence Data
  • Open Reading Frames
  • Peptides / chemistry*
  • Proteomics
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Homology, Amino Acid
  • Software


Substances

  • Cations
  • Ligands
  • Peptides