Comparative Study
Genome Res., 14 (4), 580-90

Genomic Analysis of the Nuclear Receptor Family: New Insights Into Structure, Regulation, and Evolution From the Rat Genome

Zhengdong Zhang et al. Genome Res.

Abstract

Completion of the Rattus norvegicus genome sequence enabled a global inventory and analysis of the nuclear receptors (NRs) in three mammalian species. Forty-nine NR members were found in mouse and 48 in human. Forty-seven were found in rat, with sequence gaps at the locations expected for the remaining two. Pairwise comparisons of their distribution in rat, mouse, and human identified 11 syntenic NR gene blocks, including three small clusters of two or three closely related genes, each spanning 40 kb to 1700 kb. The exon structure of the ligand-binding domain suggests that exon shuffling has played a role in the evolution of this family. A splice junction that is invariant in all members of the NR family except LXRbeta suggests a functional role for the intron. The ligand-binding domains of PXR and CAR are among the most divergent in the family. Their higher nucleotide substitution rates may be related to the central role these two NRs play in the metabolism of foreign compounds and may have resulted from limited positive selection.

Figures

Figure 1
The chromosomal landscape of rat nuclear receptor genes. NR genes on the forward strand are placed to the right of the chromosomes, and NR genes on the reverse strand are placed to the left. NR1D2, NR2E3, and the sequences encoding the LBDs of NR1B2 and NR2E1 are missing due to sequence gaps in the current rat genome assembly. Their genomic locations are indicated in square brackets (L for LBD). The syntenic blocks containing NR genes are highlighted in green (see also Table 3).
Figure 2
Chromosomal location of related NR gene clusters i and ii. Genes are labeled on each figure, and closely related paralogs are similarly shaded. Coordinate positions for each chromosome are indicated below the number lines. Gene arrow lengths are proportional to the size of each gene. (A) Cluster i spans ∼270 kb. The inset gives a scale drawing of the relationship of the NR1A1 and 1D1 genes. The three known variants of NR1A1 are shown. Coding exons are shaded boxes, 3′ UTRs are open. A filled inverted triangle marks the splice acceptor of the invariant LBD splice junction (see also Fig. 3B). (B) Cluster ii, ∼1.4 Mb. The rat gene for NR1D2 is only presumed to exist at the indicated position. Sequences for this gene are absent from the assembly, and a gap exists at this position (indicated by the broken line). The rat NR1B2 is a partial gene, containing a DBD but not an LBD, most likely as a result of incomplete assembly of this draft genome. Note that the 1A and 1D genes are on opposite strands in each cluster, in the same orientation relative to each other; their order changes relative to the 1B gene.
Figure 3
Unrooted phylogenetic trees of the NR family. The same color scheme for NR subfamilies is used as in Fig. 1. Group-level designations (e.g., 0B, 1A, 1B,..., 6A) label the interior branches, and common gene names label the terminal branches. Bootstrap values, expressed as percentages, are indicated at the nodes (branch bifurcations). (A) A complete tree constructed from the multiple sequence alignment of the LBDs of all NRs found in rat, mouse, and human. Shading highlights groups exhibiting rapid evolution. (B) NR2 subfamily clade (orphan receptors) taken from the DBD tree. Shading highlights a group exhibiting increased conservation. (C) Portion of the DBD tree showing the relationship between subfamilies NR4 and NR5.
Figure 4
Variable sites in the LBD of PXR. (A) The variable sites, highlighted in gray, in the protein sequence alignment of the LBDs of the human, mouse, and rat PXRs. The corresponding secondary structure is indicated below the sequence alignment: the α-helix is represented by the cylinder, and the β-sheet by the parallelogram. The seven variable sites at which the amino acid residues line the ligand-binding pocket and the ones found in α-helix 9 and its vicinity are boxed with a solid line and a broken line, respectively. Other available LBD sequences of the rhesus, pig, rabbit, dog, chicken, and zebrafish PXRs are omitted from the sequence alignment presented here, because including them does not change the general variation pattern. (B) The same sites, highlighted in yellow, in the tertiary structure of the LBD of the human PXR (the blue solid ribbon). The agonist is shown as the small red molecular structure bound in the receptor's ligand-binding pocket, and the coactivator fragment is shown as the green solid ribbon. Details of the variable sites, with the side chains of the amino acid residues at these sites shown in yellow, are shown for the ligand-binding pocket (C) and for α-helix 9 and its vicinity (D).
Figure 5
The gene structure encoding the DBD and LBD domains of the NR genes. Open bars are exons, drawn to scale; line segments, drawn at fixed length, give intron locations. (A) DBD splice junctions. Sequences are 75–78 aa in length. The shaded boxes indicate the location of the two C4 zinc finger motifs within this highly conserved domain. Introns may be found at seven different locations in the DBD across the entire family, or may be absent. Vertical hash marks indicate the location of junctions that were shared in the following groupings: a, NR2B, NR2C1,2; b, NR1A,1B,1C,1D,1F,1H4-5, NR5A; c, NR1I, NR4A; d, NR3A,3B,3C; e, NR2E3; f, NR2A, NR2F6; g, NR2E1; and not shown are group h, NR1H, NR2F, and NR6A, which have no intron in the DBD. (B) LBD splice junctions. Sequences are 170 (NR1D2) to 208 (NR0B1) aa in length. Each row is a schematic drawing giving the relative location of the splice junction and the group of NRs sharing the splice junction pattern. The position of splice junctions in orthologs was always the same, and thus species designations are omitted. Two conserved motifs (I and II, see text) in the LBDs are shown as the hatched areas. The location of a highly conserved negatively charged amino acid residue (aspartic acid or glutamic acid) in motif II is marked by an inverted triangle. The four regions within which introns were found are indicated by slash marks: “\” in motif I, “|” in the intermotif region, no slash in motif II, and “/” after motif II (see text). (C) The consensus sequences of motifs I and II. The secondary structure of the corresponding part of the LBD, derived from crystallographic studies, is indicated below the sequence. Letters in bold correspond to the residues of the NR signature, which are involved in stabilizing the canonical fold of the NR LBDs (see Wurtz et al. 1996).
