Background: Hox and the closely related ParaHox genes, which emerged prior to the divergence between cnidarians and bilaterians, are the best-known members of the ancient genetic toolkit that controls embryonic development across all metazoans. However, fundamental questions about their origin and evolutionary relationships remain unresolved. Here we investigate the evolution of metazoan Hox and ParaHox genes using the HoxPred program, which allows the identification of Hox genes without the need for phylogenetic tree reconstruction.
Results: We show that HoxPred provides an efficient and accurate classification of Hox and ParaHox genes into their respective homology groups, including Hox paralogous groups (PGs). We analyzed more than 10,000 sequences from 310 metazoan species, drawn from 6 genome projects and the complete UniProtKB database. The HoxPred program and all results, arranged in the Datab'Hox database, are freely available at http://cege.vub.ac.be/hoxpred/. Results for the genome-scale studies are consistent with previous studies and also provide information on the Hox repertoire and clusters of newly sequenced species. The unprecedented scale of this study and the use of a non-tree-based approach allow unresolved key questions about Hox and ParaHox gene evolution to be addressed.
Conclusions: Our analysis suggests that the presence of a single type of Posterior Hox gene (PG9-like) is ancestral to bilaterians, and that new Posterior PGs would have arisen in deuterostomes through independent gene duplications. Four types of Central genes would also be ancestral to bilaterians; two of them, PG6-like and PG7-like, would have given rise in protostomes to the UbdA- and ftz/Antp/Lox5-type genes, respectively. A fifth type of Central gene (PG8) would have emerged in the vertebrate lineage. Our results also suggest the presence of Anterior (PG1 and PG3), Central and Posterior Hox genes in cnidarians, supporting an ancestral four-gene Hox cluster. In addition, our data support the relationship of the bilaterian ParaHox genes Gsx and Xlox with PG3, and of Cdx with the Central genes. Our study therefore indicates three possible models for the origin of Hox and ParaHox genes in early metazoans: a two-gene (Anterior/PG3--Central/Posterior), a three-gene (Anterior/PG1, Anterior/PG3 and Central/Posterior), or a four-gene (Anterior/PG1--Anterior/PG3--Central--Posterior) ProtoHox cluster.