Gene family size variation is an important mechanism that shapes natural variation for adaptation in various species. Despite its importance, the pattern of gene family size variation in green plants is still not well understood. In particular, the evolutionary pattern of genes and gene families in the model plant Arabidopsis thaliana remains unknown in the context of green plants. In this study, eight representative genomes of green plants were sampled to study gene family evolution and to characterize the origination of A. thaliana genes. Four important insights emerged: (i) the rate of gene gain and loss is about 0.001359 per gene per million years, similar to the rates in yeast, Drosophila, and mammals; (ii) some gene families evolved rapidly, with extreme expansions or contractions, and 2745 gene families present in all eight species represent the 'core' proteome of green plants; (iii) 70% of A. thaliana genes can be traced back to 450 million years ago; and (iv) intriguingly, A. thaliana genes of early origin are under stronger purifying selection and are more conserved. In summary, the present study provides genome-wide insights into the evolutionary history and mechanisms of genes and gene families in green plants, especially in A. thaliana.
© 2012 The Author. The Plant Journal © 2012 Blackwell Publishing Ltd.