Genome-wide analysis of the MYB transcription factor superfamily in soybean

BMC Plant Biol. 2012 Jul 9:12:106. doi: 10.1186/1471-2229-12-106.

Abstract

Background: The MYB superfamily constitutes one of the most abundant groups of transcription factors described in plants. Nevertheless, the functions of its members appear to be highly diverse and remain rather unclear. To date, no genome-wide characterization of this gene family has been conducted in a legume species. Here we report the first genome-wide analysis of the whole MYB superfamily in a legume species, soybean (Glycine max), including gene structures, phylogeny, chromosome locations, conserved motifs, and expression patterns, as well as a comparative genomic analysis with Arabidopsis.

Results: A total of 244 R2R3-MYB genes were identified and further classified into 48 subfamilies based on a comparative phylogenetic analysis with their putative orthologs, which showed both gene loss and duplication events. The phylogenetic analysis showed that most characterized MYB genes with similar functions cluster in the same subfamily; together with the identification of orthologs by synteny analysis, this strongly indicated functional conservation among subgroups of MYB genes. The phylogenetic relationships of each subgroup of MYB genes were well supported by highly conserved intron/exon structures and motifs outside the MYB domain. Analysis of the ratio of nonsynonymous to synonymous nucleotide substitutions (dN/dS) showed that the soybean MYB DNA-binding domain is under strong negative selection. The chromosome distribution pattern strongly indicated that genome-wide segmental and tandem duplications contributed to the expansion of soybean MYB genes. In addition, we found that ~4% of soybean R2R3-MYB genes had undergone alternative splicing events, producing a variety of transcripts from a single gene, which illustrates the high complexity of transcriptome regulation. Comparative expression profile analysis of R2R3-MYB genes in soybean and Arabidopsis revealed that MYB genes play conserved as well as varied roles in plants, the latter indicating functional divergence.

Conclusions: In this study we identified the largest MYB gene family known in plants to date. Our findings indicate that members of this large gene family may be involved in different plant biological processes, some of which may be involved in legume-specific nodulation. Our comparative genomics analysis provides a solid foundation for future functional dissection of this gene family.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Amino Acid Motifs
  • Amino Acid Sequence
  • Arabidopsis / genetics
  • Arabidopsis / metabolism
  • Arabidopsis Proteins / genetics
  • Arabidopsis Proteins / metabolism
  • Chromosomes, Plant / genetics
  • Conserved Sequence
  • Exons
  • Gene Duplication
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Plant
  • Genes, Plant*
  • Glycine max / genetics*
  • Glycine max / metabolism
  • Introns
  • Molecular Sequence Data
  • Multigene Family*
  • Phylogeny
  • Selection, Genetic
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism

Substances

  • Arabidopsis Proteins
  • BOTRYTIS SUSCEPTIBLE1 protein, Arabidopsis
  • Transcription Factors