Proteins containing a SET-domain (SET: Su(var)3-9, E(z) and Trithorax) were collected through sequence searches of the available databases. After redundant entries were removed, the proteins belonging to three families, SU(VAR)3-9, E(Z) and Trithorax, were selected. The relationships among the members were analyzed by pairwise alignment, compilation, and comparison of their SET-domains. The degree of sequence similarity among the SET-domains defined the distribution of the proteins into families and into clades within the families. The architecture of the entire protein supported the distribution pattern built upon SET-domain similarity. Together, the parallel cladistic and protein-architecture analyses outlined two plausible criteria for predicting function.
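The similarity-based grouping described above can be illustrated with a minimal sketch. It is not the procedure used in this work: the alignment tool, scoring scheme, and cutoffs are not stated here, so the sketch assumes the domain sequences are already aligned to equal length, uses simple percent identity as the similarity measure, and groups proteins by single linkage at an arbitrary threshold. The names `percent_identity`, `group_by_similarity`, `toy_alignment`, and `protA` to `protD`, as well as the sequences themselves, are hypothetical placeholders and not real SET-domain data.

```python
from itertools import combinations
from typing import Dict, List, Set


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def group_by_similarity(aligned: Dict[str, str], threshold: float) -> List[Set[str]]:
    """Single-linkage grouping: two proteins share a group if a chain of
    pairwise identities >= threshold connects them (union-find)."""
    parent = {name: name for name in aligned}

    def find(name: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for p, q in combinations(aligned, 2):
        if percent_identity(aligned[p], aligned[q]) >= threshold:
            parent[find(p)] = find(q)

    groups: Dict[str, Set[str]] = {}
    for name in aligned:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical, made-up aligned fragments (NOT real SET-domain sequences).
toy_alignment = {
    "protA": "GWGVRAKQ",
    "protB": "GWGVRTKQ",
    "protC": "GFGLFA-E",
    "protD": "GFGLFAKE",
}

# The 70% cutoff is an arbitrary illustrative choice.
families = group_by_similarity(toy_alignment, threshold=70.0)
print(sorted(sorted(group) for group in families))
```

With these toy sequences, `protA` and `protB` fall into one group and `protC` and `protD` into another. In practice, the pairwise comparison would be followed by a comparison of the overall domain architecture of each protein, which this sketch does not attempt to model.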