Comparative Study
PLoS One. 2011 Jan 28;6(1):e16219.
doi: 10.1371/journal.pone.0016219.

Phylogenetic comparison of F-Box (FBX) gene superfamily within the plant kingdom reveals divergent evolutionary histories indicative of genomic drift

Zhihua Hua et al. PLoS One. 2011.

Abstract

The emergence of multigene families has been hypothesized as a major contributor to the evolution of complex traits and speciation. To help understand how such multigene families arose and diverged during plant evolution, we examined the phylogenetic relationships of F-Box (FBX) genes, one of the largest and most polymorphic superfamilies known in the plant kingdom. FBX proteins comprise the target recognition subunit of SCF-type ubiquitin-protein ligases, where they individually recruit specific substrates for ubiquitylation. Through the extensive analysis of 10,811 FBX loci from 18 plant species, ranging from the alga Chlamydomonas reinhardtii to numerous monocots and eudicots, we discovered strikingly diverse evolutionary histories. The number of FBX loci varies widely and appears independent of the growth habit and life cycle of land plants, with as few as 198 predicted for Carica papaya to as many as 1350 predicted for Arabidopsis lyrata. This number differs substantially even among closely related species, with evidence for extensive gains/losses. Despite this extraordinary inter-species variation, one subset of FBX genes was conserved among most species examined. Together with evidence of strong purifying selection and expression, the ligases synthesized from these conserved loci likely direct essential ubiquitylation events. Another subset was much more lineage specific, showed more relaxed purifying selection, and was enriched in loci with little or no evidence of expression, suggesting that they either control more limited, species-specific processes or arose from genomic drift and thus may provide reservoirs for evolutionary innovation. Numerous FBX loci were also predicted to be pseudogenes, with their numbers tightly correlated with the total number of FBX genes in each species. Taken together, it appears that the FBX superfamily has independently undergone substantial birth/death in many plant lineages, with its size and rapid evolution potentially reflecting a central role for ubiquitylation in driving plant fitness.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Pipeline for the comprehensive annotation of FBX genes in 18 plant species.
(A) Procedure for identifying FBX genes in 18 plant genomes. The initial FBXD query collection from PFAM and SMART was used to find the FBX proteins by BLASTp searches and HMMER3 predictions of each plant proteome and then incorporated with the FBX proteins collected from the other plant and non-plant proteomes to create two comprehensive sequence collections, FBX_Query and FBX_Ref. FBX_Query was used to search each genome for FBXD regions by tBLASTn, and the non-annotated regions surrounding the FBXD were re-annotated by the Closing Target Trimming (CTT) algorithm and the similarity-based annotation program GENEWISE. To avoid bias, at least two reference sequences from the FBX_Ref collection were used to predict the transcript model of each non-annotated potential FBX locus. Only when both sequences predicted the same transcript model was the coding sequence generated with the best GENEWISE score accepted as the final annotation. (B) Schematic description of the CTT algorithm. The “x” indicates a target FBXD region identified from the tBLASTn search. “Hit1” indicates the top hit from a BLASTx search using the ∼10-kbp genomic DNA sequence as query against FBX_Ref. The process was iterated up to six times for each non-annotated potential FBX locus.
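The acceptance rule in panel (A) can be illustrated with a minimal sketch. This is not the authors' code: the `Prediction` record, its fields, and the function name `accept_annotation` are hypothetical, and transcript models are assumed to be comparable as plain strings.

```python
from typing import NamedTuple, Optional, Sequence


class Prediction(NamedTuple):
    """One similarity-based prediction for a locus (hypothetical record)."""
    reference_id: str       # which reference sequence guided the prediction
    transcript_model: str   # assumed: exon structure serialized as text
    score: float            # alignment score reported for this prediction


def accept_annotation(predictions: Sequence[Prediction]) -> Optional[Prediction]:
    """Accept a locus only if at least two references agree on the model.

    Returns the best-scoring prediction when every prediction yields the
    same transcript model, otherwise None (the locus stays unannotated).
    """
    if len(predictions) < 2:
        return None
    models = {p.transcript_model for p in predictions}
    if len(models) != 1:
        return None
    return max(predictions, key=lambda p: p.score)


# Made-up example values, for illustration only.
agreeing = [
    Prediction("refA", "100-250,400-620", 71.5),
    Prediction("refB", "100-250,400-620", 88.0),
]
conflicting = [
    Prediction("refA", "100-250,400-620", 71.5),
    Prediction("refB", "100-300", 88.0),
]
best = accept_annotation(agreeing)
rejected = accept_annotation(conflicting)
```

Requiring agreement before taking the best score keeps a single high-scoring but idiosyncratic reference from deciding the gene model on its own.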
Figure 2. Relationship between the number of total FBX genes, FBX pseudogenes and genome size for 18 plant species.
(A,B) The number of total FBX genes (A) and the number of FBX pseudogenes (B) were plotted against the genome size of each species. (C) The number of FBX pseudogenes was plotted against the number of total FBX genes in each species. The Spearman rank correlation coefficients (rho), the corresponding p values, and the linear model-fitted trend lines are shown.
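As a rough illustration of the statistic reported in this figure, the sketch below computes a Spearman rank correlation coefficient using only the Python standard library. The gene counts are made-up placeholders, not values from the study, and the helper names are assumptions introduced here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical counts for five species (illustrative only).
total_fbx = [200, 350, 700, 900, 1350]
pseudogenes = [20, 60, 55, 300, 410]
rho = spearman_rho(total_fbx, pseudogenes)
```

Because the coefficient depends only on ranks, it is insensitive to the large differences in scale between small and large FBX families.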
Figure 3. The gain/loss analysis of FBX genes during the evolution of 18 plant species.
(A) The gain and loss of FBX genes in each species and species split nodes. The phylogeny of each species and the taxonomic group designations were adopted from the Angiosperm Phylogeny Website (http://www.mobot.org/MOBOT/research/APweb/). The heatmap color bars at each branch/node represent the predicted number of genes gained (right) or lost (left), respectively. The actual numbers of gains/losses are indicated below each bar. The full names of the species along with their abbreviations are as listed in Table 1. (B) Species-specific generation of FBX genes in each species.
Figure 4. The enrichment/depletion of various C-terminal substrate-recognition modules in the collection of FBX proteins from each species.
The red and green-inverted triangles indicate the significant enrichment and depletion of FBX proteins (p<0.05, Fisher's exact test), respectively, for each species as compared with the average number (indicated by a horizontal grey line) of FBX proteins containing the same module from all 18 plant species.
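For readers unfamiliar with the test named in this caption, here is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2x2 table. How the authors built their contingency tables is not stated here, so the layout below (module present/absent by species/background) is an assumption, as is the function name.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are (this species, background), columns are
    (proteins with the module, proteins without it).
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables at least as unlikely as the observed one; the small
    # tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Small textbook-style table, not data from the study.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

A species would be flagged as enriched or depleted for a module when the p-value falls below the 0.05 threshold given in the caption.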
Figure 5. The divergent distribution of FBX gene numbers at different conservation levels.
(A) The numbers of total FBX genes (top panel), FBX pseudogenes (middle panel), and FBX protein-coding genes (bottom panel) from each species were plotted against the number of species represented in the FBX orthologous groups (OGs). “Orphan” indicates the genes without any orthologs. “1–18” denotes the numbers of species in an OG. STS, small taxonomic scale; LTS, large taxonomic scale. (B) The distribution of standard deviations of FBX gene numbers, total protein-coding FBX gene numbers, and total FBX pseudogenes from 17 plant species (Cr is not included) at different conservation levels. (C) The percentage of FBX pseudogenes with low (STS) or high (LTS) conservation levels in each of the 18 plant species.
Figure 6. The distributions of Ka/Ks values for the LTSP, STSP and FBX pseudogenes in each of 18 plant species.
The Ka/Ks value for each full-length sequence was calculated by comparing it to the MRCA transcript sequence using the method of Goldman and Yang.
Figure 7. Comparisons of evolutionary selections and intronization among the LTSP, STSP and FBX pseudogene loci in each species.
(A) The percentage of FBX genes under purifying selection (top panel), neutral change (middle panel), and adaptive selection (bottom panel) in each subgroup. (B) The percentage of FBX genes containing at least one intron in each subgroup.
Figure 8. The distributions of tandem and segmental duplicated FBX genes in each of 18 plant species.
(A) Total FBX genes. (B) STSP genes. (C) LTSP genes. (D) FBX pseudogenes.
Figure 9. Expression correlation test and functional predictions of A. thaliana FBX genes.
(A) The expression correlations of 395 FBX genes. The Pearson's correlation coefficients of ∼4,000 microarray datasets for each FBX gene were calculated pairwise. The dendrogram at the top of the panel shows the hierarchical clustering of 395 FBX genes based on the dissimilarities of Pearson's correlation coefficients. The bar codes indicate the distributions of LTSP FBX genes (red), STSP FBX genes (blue), and FBX pseudogenes (green). The horizontal histogram (magenta) on the left shows the EST numbers for each FBX gene. The heatmap color key, indicating the Pearson's correlation coefficient values, is shown in the top right corner of the panel. The correlations of cluster-a genes are highlighted with a red box and the correlations of cluster-b genes are highlighted with a blue box. (B) Enrichment assay of expressed LTS (top panel) and STS (bottom panel) FBX genes in each of the 3,868 different microarray datasets. The black line in each panel indicates the mean number of expressed FBX genes. Experiments above the top or below the bottom red line in each panel represent the datasets in which the overall FBX gene expression frequency was significantly increased or decreased, respectively (Fisher's exact test, p<0.05). The statistically significant enrichments of expressed FBX genes in a microarray dataset, from which the function(s) of FBX genes could be inferred, and the experimental condition examined by the microarray for both expressed LTS and STS FBX genes are summarized in Table S10.
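The pairwise correlation step in panel (A) can be sketched as follows. The caption does not define the dissimilarity measure, so `1 - r` is assumed here; the gene names and expression values are invented for illustration, and a real analysis would pass the resulting matrix to a hierarchical clustering routine.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def dissimilarity_matrix(profiles):
    """Map each ordered gene pair to 1 - r (assumed dissimilarity).

    Values range from 0 (perfectly co-expressed) to 2 (perfectly
    anti-correlated).
    """
    names = list(profiles)
    return {
        (g1, g2): 1.0 - pearson(profiles[g1], profiles[g2])
        for g1 in names
        for g2 in names
    }


# Invented expression profiles across four hypothetical datasets.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
dist = dissimilarity_matrix(profiles)
```

Genes with small pairwise dissimilarity end up on neighboring branches of a dendrogram, which is how co-expressed clusters such as cluster-a and cluster-b become visible in a heatmap.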
