The products of the trithorax gene are required to stably maintain homeotic gene expression patterns established during embryogenesis by the action of the transiently expressed segmentation genes. We have determined the intron/exon structure of the trx gene and the large alternatively spliced trx RNAs, which are capable of encoding only two protein isoforms. These very large trx proteins differ only in a long Ser- and Gly-rich N-terminal extension, encoded by exon II, which is present only in the larger trx isoform. We have identified a novel variant of the highly conserved nuclear receptor type of DNA binding domain. We have found that the previously identified Cys-rich central region contains multiple novel zinc finger motifs which are also present in the Polycomb-like protein and RBP2, a retinoblastoma binding protein. The trx proteins terminate with another novel conserved domain which we have identified in proteins from three kingdoms, including plants and fungi, indicating that it has an ancient origin. Many of these proteins are chromosomally associated, suggesting that this domain may be involved in interactions between trx and other highly conserved components of chromatin involved in transcriptional regulation. The sequence alterations of trx mutations identify the highly conserved regions of trx as critical for the function of these large proteins. We show that zygotically expressed trx RNAs encoding the larger protein isoform are initially expressed in a spatially restricted pattern which overlaps the expression domains of the BX-C genes Ubx, abd-A and Abd-B. This pattern is transient and evolves into a broader expression domain encompassing the entire germ band during the extended germ band stage.