To isolate cDNA clones encoding the carcinoembryonic antigen (CEA), we constructed a cDNA library from human colon tumor mRNA. The library was screened with various oligonucleotides whose sequences had been deduced from partial amino acid sequence data for CEA. Positive candidate clones were hybridized with a probe for repetitive DNA, because CEA mRNA contains an Alu repetitive element, and with a fragment of a genomic clone of nonspecific cross-reacting antigen, an antigen closely related to CEA. Here we report the nucleotide sequence of the two overlapping CEA cDNA clones comprising 1422 nucleotides of CEA mRNA. This sequence encodes the 372 COOH-terminal amino acids of CEA followed by 305 nucleotides of 3' untranslated sequence containing a truncated Alu repeat. The predicted protein sequence is composed of two repeats of 178 amino acids each, which share an exceptionally high homology of 67%. Each repeat unit contains four conserved cysteine residues and six to nine putative N-glycosylation sites. CEA mRNA is most strongly expressed in primary colon tumors and, to a lesser extent, in normal colonic tissue. No CEA mRNA is found in HeLa cells or normal human fibroblasts.