Genomic clones containing portions of the human cathepsin E (CTSE) gene were isolated from cosmid and lambda recombinant libraries. The coding regions, the 5'- and 3'-untranslated regions, and the exon-intron boundaries of the CTSE gene were identified by sequence and hybridization analysis. The size and placement of the nine exons found in the 17.5-kilobase CTSE gene were highly conserved relative to other aspartic proteinases, providing additional evidence that these proteinases are derived from a common ancestral gene. Segregation and linkage analysis of two informative restriction fragment length polymorphisms (MspI and DraI) indicated that there is a single human CTSE locus, located at chromosome 1q31-q32, which is closely linked to the renin gene. Three CTSE transcripts (3.6, 2.6, and 2.1 kilobases) were identified in poly(A+) RNA from gastric fundic and antral mucosa, and these appeared identical in size and relative abundance to those in poly(A+) RNA from cultured gastric adenocarcinoma cell lines containing CTSE. Sequence analysis of cDNA clones and comparison with the 3'-flanking untranslated region in genomic clones provided evidence that alternative polyadenylation of the primary transcript produced the 2.6- and 2.1-kilobase transcripts, which together constituted greater than 95% of CTSE transcripts found in the stomach.