By searching the human genome sequence database with hGSTA1 and hGSTA4 cDNA sequences, we identified three PAC clones and one BAC clone that together cover more than 400 kilobases and contain the entire GST alpha gene cluster. The cluster consists of five genes (hGSTA1, hGSTA2, hGSTA3, hGSTA4 and hGSTA5) and seven pseudogenes, which are distinguished as such by single-base and/or complete exon deletions. Using gene-specific probes, we demonstrated that hGSTA1, hGSTA2 and hGSTA4 mRNAs are widely expressed in human tissues, whereas hGSTA3 mRNA appears to be a rare message subject to splicing defects. Although examination of the hGSTA5 gene sequence suggests that it is a functional gene, hGSTA5 mRNA could not be detected in the human tissues we studied. hGSTA1 expression has been shown to be influenced by a genetic polymorphism consisting of two alleles, hGSTA1*A and hGSTA1*B, which differ by three linked base substitutions in the proximal promoter, at positions -567, -69 and -52. Constructs consisting of the luciferase gene controlled by the variant hGSTA1 promoters showed differential expression when transfected into HepG2, GLC4 and Caco-2 cells: hGSTA1*A > hGSTA1*B. Site-directed mutagenesis of each base substitution indicated that the base change -52G>A was responsible for the differential promoter activity of hGSTA1*A and hGSTA1*B. The base at position -52 also altered binding of the ubiquitous transcription factor Sp1, as determined by gel shift analysis. Thus, it may be postulated that hGSTA1 genotyping, through its effect on hGSTA1 expression, will be important for determining individual susceptibility to certain cancers or the efficacy of chemotherapeutics.