The cytoplasmic beta-catenin protein is implicated in signal transduction and associates with both the cell-cell adhesion protein E-cadherin and the tumor suppressor gene product APC. We determined the primary structure of the human beta-catenin gene (CTNNB1) by analysis of cDNA and genomic clones. The complete gene spans 23.2 kb. Restriction mapping and partial sequence analysis revealed 16 exons. All splice donor and acceptor sites conformed to the GT/AG rule. Exon sizes ranged from 61 to 790 bp. Half of the introns were smaller than 550 bp; the smallest was 84 bp and the longest 6700 bp. The intron-exon boundaries coincided neither with conserved sites in the 12 armadillo repeat sequences of beta-catenin nor with intron-exon boundaries in the armadillo gene of Drosophila. A major transcription initiation site was identified as an A residue 214 nucleotides upstream of the ATG initiation codon. The resulting transcript is 3362 nucleotides long. Compared with the previously published mRNA sequence, additional residues were identified: 16 at the 5' end and 766 at the 3' end. An alternative splice acceptor site within exon 16 shortened the 3' UTR by 159 bp. Polymerase chain reaction on cDNA from 14 human cell lines demonstrated the general occurrence of both splice variants. The 5'-flanking region is highly GC-rich and lacks a CCAAT box, but contains a TATA box and potential binding sites for several transcription factors, such as NF kappa B, SP1, AP2, and EGR1. Both a 437-bp fragment and a 6-kb fragment, the latter containing about 4.7 kb of the 5'-flanking region in addition to the noncoding exon 1 and 1 kb of intron 1, showed clear promoter activity when linked to a secreted alkaline phosphatase reporter gene and transfected into a mouse epithelial cell line.