The sequence of the human Gc gene, including 4228 base pairs of the 5'-flanking region and 8514 base pairs of the 3'-flanking region (55,136 base pairs in total), was determined from five overlapping lambda phage clones. The gene spans 42,394 base pairs from the cap site to the polyadenylation site and is composed of 13 exons, which are symmetrically placed within the three domains of the Gc protein. The first exon is partially untranslated, as is exon 12, which contains the termination codon TAG. Exon 13 is entirely untranslated but contains the polyadenylation signal AATAAA. Ten central introns split the coding sequence between codon positions 2 and 3 and between codon positions 3 and 1 in an alternating pattern, exactly as has been observed in the structure of the albumin and alpha-fetoprotein (AFP) genes. The Gc gene has several distinctive features that set it apart from the other members of the family. First, the gene is smaller by two exons, which results in a protein some 130 amino acids shorter than albumin or AFP. This decrease in size may result from the loss of two internal exons during the evolutionary history of the Gc gene. Second, exons 6, 8, 9, and 11 are smaller than their counterparts in albumin or AFP by a total of 8 codons (1, 4, 1, and 2, respectively). Although the mRNA and protein expressed from the Gc gene are significantly smaller, the gene itself is about 2.5 times larger than the other genes of the family. (ABSTRACT TRUNCATED AT 250 WORDS)
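The length and codon figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis; the constant and function names are hypothetical and chosen only for illustration, and every number is taken directly from the abstract.

```python
# Figures quoted in the abstract, in base pairs.
FLANK_5_BP = 4228        # 5'-flanking region
TRANSCRIBED_BP = 42394   # cap site to polyadenylation site
FLANK_3_BP = 8514        # 3'-flanking region
REPORTED_TOTAL_BP = 55136

# Codons by which each listed exon is shorter than its albumin/AFP
# counterpart, keyed by exon number (values as stated in the abstract).
CODON_DEFICIT_BY_EXON = {6: 1, 8: 4, 9: 1, 11: 2}

EXON_COUNT = 13


def total_sequenced(flank_5: int, transcribed: int, flank_3: int) -> int:
    """Sum the three contiguous regions that make up the sequenced span."""
    return flank_5 + transcribed + flank_3


def total_codon_deficit(deficits: dict[int, int]) -> int:
    """Sum the per-exon codon deficits."""
    return sum(deficits.values())


def intron_count(exons: int) -> int:
    """A gene with n exons has n - 1 introns separating them."""
    return exons - 1


if __name__ == "__main__":
    total = total_sequenced(FLANK_5_BP, TRANSCRIBED_BP, FLANK_3_BP)
    print("sequenced span (bp):", total, "matches reported:", total == REPORTED_TOTAL_BP)
    print("codon deficit:", total_codon_deficit(CODON_DEFICIT_BY_EXON))
    print("introns:", intron_count(EXON_COUNT))
```

The flanking and transcribed lengths sum to the reported total, and the four per-exon deficits sum to the stated 8 codons. Thirteen exons imply 12 introns, so the ten central introns described above leave two that are not covered by the alternating pattern.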