We report the sequence of the entire human gene encoding beta-glucocerebrosidase and that of its associated pseudogene. The gene contains 11 exons and extends from base pair 355 to base pair 7232 of the overall sequence. The gene promoter contains TATA- and CAT-like boxes upstream of the major 5' end of the glucocerebrosidase mRNA. Relative to that 5' end, the two TATA boxes span nucleotides -23 to -27 and -33 to -39, and the two possible CAT boxes span nucleotides -90 to -94 and -96 to -99. The functionality of the promoter region was tested by coupling it to the bacterial gene encoding chloramphenicol acetyltransferase (CAT) and assaying expression of the enzyme in cells transfected with this vector. The glucocerebrosidase promoter not only directs synthesis of the bacterial enzyme but also exhibits the same pattern of tissue-specific expression as the endogenous gene. An apparently tightly linked pseudogene is approximately 96% homologous to the functional gene. However, introns 2, 4, 6, and 7 of the pseudogene have large "deletions" relative to the functional gene, consisting of Alu sequences 313, 626, 320, and 277 bp in length, respectively. It is entirely possible that the ancestral gene lacked these sequences and that they were inserted into the introns of the functional gene. The pseudogene also has a 55-bp deletion in part of exon 9, flanked by a short inverted repeat. The sequence data should facilitate the development of methods for diagnosing Gaucher disease at the molecular level.
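The lengths quoted above can be tallied with a few lines of arithmetic. The following is a minimal sketch, assuming the base-pair coordinates are 1-based and inclusive (the text does not state the convention); all variable and function names are hypothetical and chosen only for illustration.

```python
# Figures taken directly from the text above.
GENE_START_BP = 355
GENE_END_BP = 7232
# Intron number -> length (bp) of the Alu sequence missing from that intron.
ALU_DELETIONS_BP = {2: 313, 4: 626, 6: 320, 7: 277}
EXON9_DELETION_BP = 55


def inclusive_span(start: int, end: int) -> int:
    """Length of a region, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


# Extent of the functional gene in the overall sequence.
gene_span_bp = inclusive_span(GENE_START_BP, GENE_END_BP)

# Combined length of the four intronic Alu "deletions".
total_alu_bp = sum(ALU_DELETIONS_BP.values())

# Alu deletions plus the 55-bp exon 9 deletion.
total_deleted_bp = total_alu_bp + EXON9_DELETION_BP

print(gene_span_bp, total_alu_bp, total_deleted_bp)
```

Under the inclusive-coordinate assumption, the gene spans 6878 bp, and the deletions named in the text account for 1536 bp of Alu sequence, or 1591 bp once the exon 9 deletion is included. These totals cover only the deletions listed here and should not be read as the full length difference between gene and pseudogene.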