Comparative Study
Genome Res. 2003 Jun;13(6A):1097-110.
doi: 10.1101/gr.963903. Epub 2003 May 12.

Differential Expansion of Zinc-Finger Transcription Factor Loci in Homologous Human and Mouse Gene Clusters

Free PMC article
Mark Shannon et al. Genome Res.

Abstract

Mammalian genomes carry hundreds of Krüppel-type zinc finger (ZNF) genes, most of which reside in familial clusters. ZNF genes encoding Krüppel-associated box (KRAB) motifs are especially prone to this type of tandem organization. Despite their prevalence, little is known about the functions or evolutionary histories of these clustered gene families. Here we describe a homologous pair of human and mouse KRAB-ZNF gene clusters containing 21 human and 10 mouse genes, respectively. Evolutionary analysis uncovered only three pairs of putative orthologs and two cases where a single gene in one species is related to multiple genes in the other; several human genes have no obvious homolog in mouse. We deduce that duplication and loss of ancestral cluster members occurred independently in the primate and rodent lineages after divergence, yielding substantially different ZNF gene repertoires in humans and mice. Differences in expression patterns and sequence divergence within the DNA binding regions of predicted proteins suggest that the duplicated genes have acquired novel functions over evolutionary time. Since KRAB-ZNF proteins are predicted to function as transcriptional regulators, the elaboration of new lineage-specific genes in this and other clustered ZNF families is likely to have had a significant impact on species-specific aspects of biology.

Figures

Figure 1
Maps of the homologous human and mouse ZNF gene family regions. Positions of KRAB-A- (vertical lines) and ZNF-encoding exons (boxes) that comprise gene models for 21 human and 10 mouse ZNF genes, as designated by arrows over the corresponding exons indicating transcriptional direction, are drawn above maps of relevant portions of HSA19q13.2 sequence contig NT_011109.13 (top) and Mmu7 contig NT_039407.1 (bottom). Locations of flanking markers KCNN4 and human LOC125931 (mouse LOC330484) are also shown. Numbers below each map correspond to nucleotide positions in the sequenced contig. The regions surrounding mouse and human gene clusters are inverted in telomeric-centromeric orientation due to an ancient chromosome rearrangement event, as indicated by symbols “tel” and “cen” above each map. To align homologous genes, we therefore display the reverse complement of the mouse contig sequence (as indicated by arrow below the mouse map). Conflicts between the NT_039407.1 sequence assembly and cDNA sequences were resolved by examining draft sequence from two overlapping BACs, the approximate extent of which is illustrated by lines drawn at the bottom of the mouse map. Associated numbers correspond to GenBank accession numbers for the BAC sequences. In addition to 10 complete genes, the mouse region contains an isolated ZNF-like segment without a significant ORF (filled box, at positions 407113–407607 at the 3′ end of Zfp61) and a single isolated KRAB-A sequence that is not associated with a known gene (positions 442889–443028 on NT_039407.1, indicated by an asterisk).
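Displaying the mouse contig as its reverse complement changes every coordinate on it. The Python sketch below shows one way such a conversion could be done. The function names and the 10-bp toy contig are illustrative assumptions, not part of the original analysis, and the true length of NT_039407.1 is not stated here, so no real coordinates are converted.

```python
from typing import Tuple

# Illustrative sketch only: names and the toy sequence are assumptions.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_interval(start: int, end: int, contig_length: int) -> Tuple[int, int]:
    """Map a 1-based inclusive interval onto the reverse-complemented contig."""
    return contig_length - end + 1, contig_length - start + 1


toy_contig = "AAACCCGGGT"   # hypothetical 10-bp contig
feature = (2, 4)            # 1-based, inclusive; covers "AAC"
flipped = flip_interval(feature[0], feature[1], len(toy_contig))
```

The flipped interval selects, on the reverse-complemented string, the reverse complement of the original feature, which is the property needed to keep gene models aligned between the two maps.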
Figure 2
Sequence alignment of the predicted KRAB domains encoded by members of the (A) HSA19q13.2 and (B) Mmu7 ZNF gene families. Consensus sequences are shown below each set of human and mouse sequences. In the consensus sequence, amino acids that are conserved in all sequences are denoted by capitalized symbols; others are shown in lowercase letters. Dashes indicate that KRAB-B domain sequences either were not found in the corresponding cDNAs or were not predicted from genomic sequence.
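The capitalization convention for the consensus line can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the caption does not say how the lowercase residue is chosen, so the sketch assumes it is the most frequent residue in the column, and the function name and toy sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Build a consensus line from equal-length aligned sequences.

    Uppercase: the residue is identical in every sequence.
    Lowercase: the most frequent residue in the column (an assumption).
    """
    out = []
    for column in zip(*aligned):
        residues = [r.upper() for r in column if r != "-"]
        if not residues:
            out.append("-")  # column contains only gaps
            continue
        top, _count = Counter(residues).most_common(1)[0]
        fully_conserved = (
            len(residues) == len(column) and len(set(residues)) == 1
        )
        out.append(top if fully_conserved else top.lower())
    return "".join(out)


toy_alignment = ["DVAV", "DVAI", "DVSV"]   # hypothetical aligned fragments
toy_consensus = consensus(toy_alignment)
```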
Figure 3
Comparison of the ZNF repeat regions encoded by members of the (A) HSA19q13.2 and (B) Mmu7 ZNF gene families, showing the variation in number of finger motifs between genes, including closely related sets. Black boxes indicate typical C2H2-type ZNF repeats, and striped boxes denote clearly degenerate repeats, defined as any that lack one or more of the key cysteine or histidine residues or have a variation in spacing between critical amino acid positions that might affect finger structure. Several of these genes have additional possible degenerate repeats in the “spacer” part of the spacer+fingers exon (the spacer would be at the 5′ end of the region illustrated); these range from obvious former fingers to barely recognizable potential remnants (only the degenerate fingers shown here were included in the alignments). LOC147711 has a 1-bp deletion that frameshifts the translation of the 3′ fingers, which would otherwise be comparable to those of ZNF285.
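The distinction between typical and degenerate repeats can be illustrated with a simple pattern match. The sketch below assumes the commonly cited C2H2 spacing C-x(2,4)-C-x(12)-H-x(3,5)-H; the exact criteria used for this figure are not given here, and the function name and the two example repeats are hypothetical.

```python
import re

# Assumption: commonly cited C2H2 spacing, C-x(2,4)-C-x(12)-H-x(3,5)-H.
# The criteria actually applied in the figure may differ.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def classify_finger(repeat: str) -> str:
    """Label a single finger repeat as 'typical' or 'degenerate'."""
    if C2H2_PATTERN.search(repeat.upper()):
        return "typical"
    return "degenerate"


# Hypothetical consensus-like repeat with both cysteines and both histidines.
typical_example = "YKCPECGKSFSQKSNLQKHQRTH"
# Same repeat with the second cysteine replaced by serine.
degenerate_example = "YKCPESGKSFSQKSNLQKHQRTH"

labels = [classify_finger(typical_example), classify_finger(degenerate_example)]
```

A repeat that loses a key cysteine or histidine, or whose spacing falls outside the allowed ranges, no longer matches the pattern and is reported as degenerate.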
Figure 4
Predicted evolutionary relationships between proteins encoded by the homologous HSA19q13.2 and Mmu7 ZNF gene families, based on neighbor-joining analysis on (A) amino-acid sequences and (B) nucleotide sequences. Single best trees are shown, and bootstrap values above 70% (based on 1000 bootstrap replicates) have been added above the branches. In a few cases in both trees, bootstrap values were placed in smaller font below shorter branches for legibility. The ZNF repeat regions of the 21 human and 10 mouse family members were aligned using the ClustalW 1.8 program. The evolutionary relationships between the amino acid and nucleotide sequence sets were predicted using the PAUP program (see Methods). Bars on the right indicate five clades including both human and mouse genes that are also well supported by parsimony and maximum-likelihood results; there are three pairs of potential orthologs, whereas Groups I and IV contain a single gene from one species and multiple related genes from the other. Other clades were left unmarked if they only included human genes or were not consistently well supported. Some of the more ancient relationships between groups are not as well resolved in parsimony and ML analyses.
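Bootstrap values such as those reported on the trees come from rebuilding the tree on resampled alignments and counting how often each clade reappears. The Python sketch below illustrates only the resampling and counting steps under simple assumptions; it does not reproduce the PAUP analysis or the neighbor-joining step, and the function names and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(aligned, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    n_cols = len(aligned[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in aligned]


def support_percent(clade_found_flags):
    """Percentage of replicates in which a given clade was recovered."""
    return 100.0 * sum(clade_found_flags) / len(clade_found_flags)


rng = random.Random(0)                     # fixed seed for repeatability
toy_alignment = ["ACDE", "ACDF", "GCDE"]   # hypothetical aligned sequences
replicate = bootstrap_alignment(toy_alignment, rng)

# In a real analysis each replicate would be passed to a tree-building step,
# and the flags below would record whether the clade appeared in that tree.
example_support = support_percent([True] * 8 + [False] * 2)
```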
Figure 5
(A) Comparison of predicted proteins encoded by ZNF235, Zfp235, and five other mouse genes in Group IV. Entire proteins were aligned to maximize amino acid sequence identities; the order in which the genes are listed here is not meant to indicate a hypothesis of a linear series of duplication events. Boxes indicate KRAB-A and -B domains and ZNF repeats schematically. Diagonally striped boxes are degenerate finger repeats. Some of the functional finger repeats are filled in shades of gray to help diagrammatically indicate chosen fingers that are present in most proteins but are absent in another. The sequence of Zfp111 contains evidence for at least one internal repeat-duplication event (underline) and possibly a remnant of a second (dot on suspected duplicated finger); three of the fingers duplicated in Zfp111 (light gray) are deleted in Zfp114. The numerical values over subregions of the predicted proteins indicate the amino acid sequence identity between ZNF235 (top) and proposed homologous regions of each of the six mouse proteins, omitting the sections that are absent in either protein. (B) Comparison of the Zfp61, ZNF226, and ZNF234 proteins. Predicted complete proteins were aligned to maximize amino acid sequence identities. KRAB-A and -B domains and ZNF repeats are indicated by boxes (with diagonally striped boxes for degenerate fingers as above). Numerical values over subregions of the proteins indicate the amino acid sequence identity between Zfp61 and ZNF226 or Zfp61 and ZNF234; therefore there is no value for the block of fingers shared by the two human proteins but absent (probably due to a deletion event) in Zfp61 (*).
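Identity values that omit sections absent in either protein can be computed by skipping gapped alignment columns. The Python sketch below shows this calculation under that assumption; the function name and the toy aligned strings are hypothetical and do not come from the proteins in the figure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # section absent in one protein: not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no columns with residues in both sequences")
    return 100.0 * matches / compared


# Hypothetical aligned fragments; the gap marks a residue missing in one.
toy_identity = percent_identity("ACDE-FG", "ACDQWFG")
```

Here five of the six ungapped columns match, so the gapped column affects neither the numerator nor the denominator.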
Figure 6
Organization of predicted orthologs and paralogs in the human and mouse maps. The 700-kb region encompassing the human ZNF gene family is represented at top, with the physical map of the related 300-kb Mmu7 ZNF gene family illustrated below it. The gene names are color-coded to highlight evolutionary relationships within and between maps, according to data presented in Fig. 4; Group I is blue, Group II is red, Group III is orange, Group IV is green, and Group V is purple. Colored lines connecting genes in the two families indicate putative pairs of orthologs, or sets where a single gene in one species belongs to a clade with an expanded group of multiple genes in the other species. The relationships of the human genes in black are not as well resolved (taking into account parsimony and ML results and high divergence), and therefore they are not connected on this diagram except for two closely related pairs of duplicates indicated by gray lines. Arrows indicate the approximate positions of genes; the two black triangles represent two gene fragments (an isolated KRAB-A box and a segment of DNA containing several degenerate fingers) as detailed in Fig. 1.
Figure 7
Expression of HSA19q13.2 ZNF gene family members in human tissues. Northern blots of poly(A)+ RNA from whole tissues were hybridized to gene-specific probes for each family member. The gene names are indicated at the left and are grouped according to evolutionary relationships between the loci. Approximate molecular weights of transcripts are indicated in the middle of each panel. The blots used for these studies were also rehybridized to a probe for human β-actin, with a representative set of results shown in the bottom panel. *: The Northern blots hybridized with the ZNF404 and ZNF283 probes were identical to the others except that a uterus RNA sample (for ZNF404 and ZNF283) replaced the ovary RNA (all others) in the lane indicated.
