A 2-oxoglutarate-dependent dioxygenase (EC 126.96.36.199) that catalyzes the 4-hydroxylation of desacetoxyvindoline was purified to homogeneity. Three oligopeptides isolated from a tryptic digest of the purified protein were microsequenced, and one showed significant homology to hyoscyamine 6β-hydroxylase from Hyoscyamus niger. A 36-mer degenerate oligonucleotide based on this peptide sequence was used to screen a Catharanthus roseus cDNA library, and three clones, cD4H-1 to -3, were isolated. Although none of the three clones was full-length, the open reading frame of each encoded a putative protein containing the sequences of all three peptides. Primer extension analysis suggested that cD4H-3, the longest cDNA clone, was missing 156 bp at its 5' end, and sequencing of the genomic clone gD4H-8 confirmed this result. Southern blot analysis suggested that d4h is present as a single-copy gene in C. roseus, which is a diploid plant, and the significant differences between the 3'-UTR sequences of cD4H-1 and -3 suggest that they represent dimorphic alleles of the same hydroxylase. The identity of the clone was further confirmed when extracts of transformed Escherichia coli expressed D4H enzyme activity. The D4H clone encoded a putative protein of 401 amino acids with a calculated molecular mass of 45.5 kDa, and the amino acid sequence showed a high degree of similarity to those of a growing family of 2-oxoglutarate-dependent dioxygenases of plant and fungal origin. The similarity was not restricted to the dioxygenase protein sequences but extended to gene structure and organization, since the 205 and 1720 bp introns of d4h were inserted around the same highly conserved amino acid consensus sequences as those for e8 protein, hyoscyamine 6β-hydroxylase and ethylene-forming enzyme. These results provide further support for the view that a common ancestral gene is responsible for the appearance of this family of dioxygenases.
Hydroxylase assays and RNA blot hybridization studies showed that enzyme activity closely followed the levels of d4h transcripts, occurring predominantly in young leaves and at much lower levels in stems and fruits. In contrast, etiolated seedlings, which contained considerable levels of d4h transcripts, had almost undetectable hydroxylase activity, whereas exposure of seedlings to light resulted in a rapid increase in enzyme activity without a significant further increase in d4h transcripts over those detected in dark-grown seedlings. These results suggest that the activating effect of light may occur at a point downstream of transcription, which remains to be elucidated.
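
As a rough illustration of two figures quoted above, the following Python sketch checks that a 36-mer probe corresponds to a 12-residue peptide, counts how many distinct sequences a fully degenerate probe for a peptide would contain, and computes the mean residue mass implied by 401 amino acids and 45.5 kDa. The peptide `EXAMPLE_PEPTIDE` and all function names are hypothetical and chosen only for illustration; the peptide actually used to design the probe is not given here, and the codon counts assume the standard genetic code.

```python
from math import prod

# Number of sense codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the counts sum to 61).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

# Hypothetical 12-residue peptide, NOT the peptide from the study.
EXAMPLE_PEPTIDE = "MWKDEFNQHYCM"


def probe_length(peptide: str) -> int:
    """Length in nucleotides of an oligonucleotide covering the whole peptide."""
    return 3 * len(peptide)


def degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences for the peptide (full back-translation)."""
    return prod(CODON_COUNTS[residue] for residue in peptide)


def mean_residue_mass(mass_kda: float, n_residues: int) -> float:
    """Average mass per residue in daltons for a protein of the given size."""
    return mass_kda * 1000.0 / n_residues


if __name__ == "__main__":
    print("probe length (nt):", probe_length(EXAMPLE_PEPTIDE))
    print("degeneracy:", degeneracy(EXAMPLE_PEPTIDE))
    print("mean residue mass (Da):", round(mean_residue_mass(45.5, 401), 1))
```

Peptides rich in Met and Trp, which each have a single codon, keep the degeneracy low, which is one common reason for choosing a particular microsequenced peptide as the basis for a degenerate probe; whether that consideration applied here is not stated.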