The differentiation of the predominant cell types of the mucosal epithelium of the mammalian gastrointestinal tract is characterized by increasing amounts of an intermediate-sized filament (IF) protein designated cytokeratin (CK) 20, which is a major cellular protein of mature enterocytes and goblet cells. Here we report the isolation of the human gene encoding CK 20, its complete nucleotide sequence, and the amino acid sequence deduced therefrom, which identifies this polypeptide (mol. wt. 48553) as a member of the type I-CK subfamily. Remarkable, however, is the comparatively great sequence divergence of CK 20 from all other known type I-CKs, with only 58% identical amino acids between the conserved alpha-helical 'rod' domains of CK 20 and, e.g., CK 14. Using riboprobes corresponding to exon 6 of the gene in Northern blot and ribonuclease protection assays, we show that the approximately 1.75 kb mRNA encoding CK 20 is specifically produced in cells of the intestinal and gastric mucosa, including tumors and cell lines derived therefrom. The appearance of CK 20-positive cells in human embryonic and fetal development and in adult tissues was studied by immunohistochemistry with CK 20-specific antibodies. CK 20 synthesis was first detected at embryonic week 8 in individual 'converted' simple epithelial cells of the developing intestinal mucosa. In later fetal stages, CK 20 synthesis extends over most goblet cells and a variable number of villus enterocytes. The distribution of CK 20-positive cells in the developing gastric and intestinal mucosa is similar to, but not identical with, the pattern in the adult intestine, in which all enterocytes and goblet cells as well as certain 'low-differentiated' columnar cells contain CK 20, whereas the neuroendocrine ('enterochromaffin') and Paneth cells are negative.
In gastrointestinal carcinomas similarly examined, CK 20 was detected in almost all cases (50/52) of colorectal adenocarcinoma, including all grades of differentiation and malignancy as well as metastatic tumors, whereas CK 20 immunostaining in gastric carcinomas was less consistent and more heterogeneous. The possible biological meaning of the specific expression of the CK 20 gene in certain cells of the gastrointestinal tract and in carcinomas derived therefrom, and the regulatory mechanisms involved in the integration of the protein into the IF cytoskeleton, are discussed.