Background: HMG-box proteins are a large and diverse superfamily of architectural factors that share one or more copies of a DNA-binding domain related in both sequence and structure. These proteins can modify chromatin structure by bending and unwinding DNA. HMG-box proteins can be divided into two subfamilies based on whether they recognize DNA in a sequence-dependent or sequence-independent manner. We recently identified an HMG-box protein involved in T cell development, designated TOX, which is highly conserved between humans and mice.
Results: We show here that, based on sequence alignment, TOX best fits into the sequence-independent HMG-box family. We identify three other predicted human and murine proteins that share a common HMG-box domain with TOX, as well as other features. The gene encoding one of these additional family members has a pattern of tissue expression that is distinct from, but overlaps with, that of TOX. In addition, we identify genes encoding predicted TOX HMG-box subfamily members in pufferfish and mosquito.
Conclusions: We have identified a novel subfamily of HMG-box proteins that is related to the recently described TOX protein. The highly conserved nature of the TOX family of proteins in humans and mice, together with differences in the pattern of expression among family members, suggests non-overlapping functions of individual proteins. In addition, our data suggest that the TOX subtype of HMG-box domain first appeared in invertebrates, was duplicated in early vertebrates, and likely took on new functions in mammalian species.