Genome Biol Evol. 1:409-14

Evolutionary Persistence of Functional Compensation by Duplicate Genes in Arabidopsis

Kousuke Hanada et al. Genome Biol Evol.

Abstract

Knocking out a gene from a genome often causes no phenotypic effect. This phenomenon has been explained in part by the existence of duplicate genes. However, it was found that in mouse knockout data duplicate genes are as essential as singleton genes. Here, we study whether it is also true for the knockout data in Arabidopsis. From the knockout data in Arabidopsis thaliana obtained in our study and in the literature, we find that duplicate genes show a significantly lower proportion of knockout effects than singleton genes. Because the persistence of duplicate genes in evolution tends to be dependent on their phenotypic effect, we compared the ages of duplicate genes whose knockout mutants showed less severe phenotypic effects with those with more severe effects. Interestingly, the latter group of genes tends to be more anciently duplicated than the former group of genes. Moreover, using multiple-gene knockout data, we find that functional compensation by duplicate genes for a more severe phenotypic effect tends to be preserved by natural selection for a longer time than that for a less severe effect. Taken together, we conclude that duplicate genes contribute to genetic robustness mainly by preserving compensation for severe phenotypic effects in A. thaliana.

Keywords: Arabidopsis thaliana; duplicate; functional compensation; phenotypic effect; selection pressure and genetic robustness.

Figures

FIG. 1.—
Sequence divergence in genes with and without phenotypic effects in single-gene knockout. Sequence divergence is measured as the p-distance (proportion of amino acid differences) and the synonymous distance (Ks) between a knockout gene and its closest paralog. The three data sets used are 1) genes with phenotypic changes observed in our study (O), 2) genes with phenotypic changes reported in the literature (L), and 3) genes with no phenotypic changes observed in our study (N). A. The p-distance in the entire data. B. The p-distance in the data without tandem duplicates. C. The Ks in the entire data. D. The Ks in the data without tandem duplicates. The distributions of p-distance and Ks values are shown as box plots with the thick solid horizontal line indicating the median value, the box representing the interquartile range (25–75%), and the dotted lines indicating the 1st to the 99th percentile. The p-distance and Ks are compared between genes with and without phenotypic effects. Genes with phenotypic effects have a significantly higher p-distance and Ks to the closest paralog than genes without any phenotypic effects (P < 0.01 in p-distance, P < 0.01 in Ks). See Table S2 (Supplementary Material online) for P values.
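The p-distance named in this caption is the proportion of amino acid differences between two aligned sequences. A minimal Python sketch of that quantity follows. The function name `p_distance`, the gap character, and the choice to skip alignment columns that contain a gap are illustrative assumptions, not details taken from the study.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues between two aligned protein sequences.

    Assumption for illustration: columns with a gap in either sequence
    are excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (hypothetical convention)
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Toy aligned sequences (made up): 2 of 8 positions differ.
print(p_distance("MKTAYIAK", "MKSAYLAK"))
```

A larger value means the knockout gene and its closest paralog have diverged more at the protein level.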
FIG. 2.—
Relationship between gene family and phenotypic effect. Based on Arabidopsis gene families generated by the Markov Clustering algorithm, 10,000 pairs of genes are randomly chosen within a gene family (A) and between gene families (B). The ratio between the number of gene pairs with the same kind of phenotype (seed, reproductive, vegetative, or conditional) and the number of gene pairs with different kinds of phenotypes is significantly higher between genes in the same gene family than between genes in different gene families (P < 1 × 10^-15).
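The pair-sampling comparison in this caption can be sketched roughly as below. This is only an illustration under stated assumptions: the data structures (`families`, mapping a family ID to its genes, and `phenotype`, mapping a gene to its phenotype class), the function names, and the sampling scheme (pick a family uniformly at random, then pick genes within it) are hypothetical, since the caption does not spell out those details. The toy gene names are made up.

```python
import random


def sample_within(families, n_pairs, rng):
    """Draw gene pairs whose two members belong to the same family."""
    # Only families with at least two genes can yield a within-family pair.
    eligible = [genes for genes in families.values() if len(genes) >= 2]
    return [tuple(rng.sample(rng.choice(eligible), 2)) for _ in range(n_pairs)]


def sample_between(families, n_pairs, rng):
    """Draw gene pairs whose two members belong to different families."""
    family_ids = list(families)
    pairs = []
    for _ in range(n_pairs):
        fam_a, fam_b = rng.sample(family_ids, 2)  # two distinct families
        pairs.append((rng.choice(families[fam_a]), rng.choice(families[fam_b])))
    return pairs


def same_to_different_ratio(pairs, phenotype):
    """Ratio of same-phenotype pairs to different-phenotype pairs."""
    same = sum(1 for a, b in pairs if phenotype[a] == phenotype[b])
    different = len(pairs) - same
    return same / different if different else float("inf")


# Toy input (hypothetical genes; S = seed, V = vegetative, R = reproductive).
families = {"F1": ["g1", "g2", "g3"], "F2": ["g4", "g5"], "F3": ["g6", "g7"]}
phenotype = {"g1": "S", "g2": "S", "g3": "V", "g4": "R",
             "g5": "R", "g6": "V", "g7": "S"}

rng = random.Random(42)  # fixed seed so the sketch is repeatable
within = sample_within(families, 10_000, rng)
between = sample_between(families, 10_000, rng)
print(same_to_different_ratio(within, phenotype))
print(same_to_different_ratio(between, phenotype))
```

Testing whether the two ratios differ significantly is a separate step that this sketch does not cover.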
FIG. 3.—
Relationship between sequence divergence and phenotypic effect. A. Relationship between the protein distance (p-distance) between a knockout gene and its closest paralog and phenotypic effect when the entire data were used. B. Relationship between the protein distance (p-distance) between a knockout gene and its closest paralog and phenotypic effect when the data without tandem duplicates were used. C. Relationship between the synonymous distance (Ks) between a knockout gene and its closest paralog and phenotypic effect when the entire data were used. D. Relationship between the synonymous distance (Ks) between a knockout gene and its closest paralog and phenotypic effect when the data without tandem duplicates were used. Phenotypic changes are classified into seed (S), reproductive (R), vegetative (V), conditional (C), or no (No) phenotypes. The distributions of p-distance and Ks values are shown as box plots with the thick solid horizontal line indicating the median value, the box representing the interquartile range (25–75%), and the dotted lines indicating the first to the 99th percentile. It appears that the order of phenotypic effect from the highest to the lowest significance is seed > reproduction > vegetative > conditional > no effect; see text for the ranking. Below each figure, significant differences are shown for each pair of phenotypes by the Wilcoxon test. “X > Y” means that X is significantly greater than Y at P = 0.05 and “XY” means that X is not different from Y at the 5% level of significance.
FIG. 4.—
Selection pressures (Ka/Ks ratio) on paralogous genes whose multiple-knockout mutants show phenotypic changes but whose single-knockout mutants do not. Selection pressures on paralogous genes whose multiple-knockout mutants show phenotypic changes in seed (S), vegetative (V), reproductive (R), or conditional (C) phenotypes are inferred by the ratio of the nonsynonymous substitution rate (Ka) to the synonymous substitution rate (Ks). The distributions of Ka/Ks ratios are shown as box plots with the thick solid horizontal line indicating the median value, the box representing the interquartile range (25–75%), and the dotted lines indicating the 1st to the 99th percentile. Significant differences are shown for each pair of phenotypes by the Wilcoxon test. “X > Y” means that X is significantly greater than Y at P = 0.05 and “XY” means that X is not different from Y at the 5% level of significance.
