A novel bioinformatics pipeline to discover genes related to arbuscular mycorrhizal symbiosis based on their evolutionary conservation pattern among higher plants

BMC Plant Biol. 2014 Dec 3;14:333. doi: 10.1186/s12870-014-0333-0.


Background: Genes involved in arbuscular mycorrhizal (AM) symbiosis have been identified primarily by mutant screens, followed by identification of the mutated genes (forward genetics). In addition, a number of AM-related genes has been identified by their AM-related expression patterns, and their function has subsequently been elucidated by knock-down or knock-out approaches (reverse genetics). However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. If such genes are constitutively expressed and therefore escape differential expression analyses, they remain elusive. The goal of this study was to systematically search for AM-related genes with a bioinformatics strategy that is insensitive to these problems. The central element of our approach is based on the fact that many AM-related genes are conserved only among AM-competent species.

Results: Our approach involves genome-wide comparisons at the proteome level of AM-competent host species with non-mycorrhizal species. Using a clustering method we first established orthologous/paralogous relationships and subsequently identified protein clusters that contain members only of the AM-competent species. Proteins of these clusters were then analyzed in an extended set of 16 plant species and ranked based on their relatedness among AM-competent monocot and dicot species, relative to non-mycorrhizal species. In addition, we combined the information on the protein-coding sequence with gene expression data and with promoter analysis. As a result we present a list of yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. Among the top candidates are three genes that encode a small family of similar receptor-like kinases that are related to the S-locus receptor kinases involved in sporophytic self-incompatibility.

Conclusions: We present a new systematic strategy of gene discovery based on conservation of the protein-coding sequence that complements classical forward and reverse genetics. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinguished groups that differ in a defined functional trait of interest.

MeSH terms

  • Computational Biology*
  • Evolution, Molecular*
  • Gene Expression Profiling
  • Gene Expression Regulation, Plant*
  • Magnoliopsida / genetics*
  • Magnoliopsida / metabolism
  • Magnoliopsida / microbiology*
  • Mycorrhizae / physiology*
  • Phylogeny
  • Plant Proteins / genetics*
  • Plant Proteins / metabolism
  • Promoter Regions, Genetic / genetics
  • Symbiosis


  • Plant Proteins