Two rounds of large-scale duplication are thought to have occurred early in vertebrate ancestry, a proposal now known as the "2R hypothesis." These duplications led to the formation of subfamilies of paralogous genes. Chromosomal regions that contain present-day paralogs (paralogous regions, or paralogons) have been identified in mammals. We show that sets of paralogons (PGs) can be assembled into a tentative "human genome paralogy map" that includes all autosomes and the X chromosome. A total of 14 PGs, containing more than 1600 genes, were assembled in this paralogy map. Genes that belong to the same PG are coparalogs. We show that identification of coparalogy can be used (i) to broaden data on gene mapping, (ii) to identify physical gene clusters that derive from early cis-duplications, and (iii) to speculate on coevolution and coregulation of genes sharing a common structure or function (functional clusters). Thus, coparalogy analyses should parallel phylogenetic analyses and can help generate hypotheses on gene and genome evolution.
Copyright 2001 Academic Press.