SALAD database: a motif-based database of protein annotations for plant comparative genomics

Nucleic Acids Res. 2010 Jan;38(Database issue):D835-42. doi: 10.1093/nar/gkp831. Epub 2009 Oct 23.


Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.
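The idea of clustering a protein group by pairwise scoring of motif patterns can be illustrated with a small sketch. The abstract does not specify the scoring function, so the Jaccard distance on motif sets used here is an assumed stand-in, not the SALAD method. The protein names, motif identifiers, and helper functions are all hypothetical toy values.

```python
from itertools import combinations

# Hypothetical motif patterns: protein ID -> ordered motif IDs.
# Toy data for illustration only; not taken from the SALAD database.
motif_patterns = {
    "protA": ["m1", "m2", "m3"],
    "protB": ["m1", "m2", "m4"],
    "protC": ["m5", "m6"],
}


def jaccard_distance(a, b):
    """Return 1 - |A n B| / |A u B| computed on the two motif sets.

    Assumed scoring for this sketch; it ignores motif order and copy number.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty patterns are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


def pairwise_distances(patterns):
    """Score every unordered pair of proteins in one group."""
    return {
        frozenset((p, q)): jaccard_distance(patterns[p], patterns[q])
        for p, q in combinations(sorted(patterns), 2)
    }


def closest_pair(distances):
    """Return the pair with the smallest distance, i.e. the first merge
    an agglomerative clustering would make."""
    pair = min(distances, key=distances.get)
    return tuple(sorted(pair))


distances = pairwise_distances(motif_patterns)
first_merge = closest_pair(distances)
print(first_merge, distances[frozenset(first_merge)])
```

In this toy group, protA and protB share two of four distinct motifs, so they are the first pair joined, while protC shares no motifs with either. A full dendrogram would repeat the merge step with a linkage rule, and bootstrap support such as that shown in the SALAD viewer would require resampling, which this sketch does not attempt.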

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Computational Biology / methods*
  • Computational Biology / trends
  • Databases, Genetic*
  • Databases, Nucleic Acid*
  • Databases, Protein
  • Genes, Plant*
  • Genetic Markers
  • Genome, Plant
  • Genomics*
  • Information Storage and Retrieval / methods
  • Internet
  • Phylogeny
  • Plants / genetics
  • Plants / metabolism*
  • Software
  • Species Specificity