The mRNA encoding human thyroglobulin has been cloned and sequenced. It consists of an 8301-nucleotide segment encoding a preprotein monomer of 2767 amino acids, flanked by 5' and 3' non-coding regions of 41 and 106 nucleotides, respectively. The preprotein comprises a leader sequence of 19 amino acids followed by the mature monomer, a polypeptide of 2748 amino acids (Mr = 302773). The amino-terminal 70% of the monomer is characterized by three types of repetitive units, whereas the remaining 30% of the protein is devoid of them. This carboxy-terminal region, however, shows an interesting homology (up to 64%) with the acetylcholinesterase of Torpedo californica. The sites of thyroid hormone synthesis are clustered at both ends of the thyroglobulin monomer. By contrast, the potential glycosylation sites are scattered along the polypeptide chain.
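
The lengths quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original work: the constant and function names are illustrative assumptions, and the summed segment length is derived here only by adding the three stated segments (it says nothing about a poly(A) tail or any sequence not mentioned in the text).

```python
# Figures quoted in the text above; the names are illustrative only.
CODING_NT = 8301       # coding segment, nucleotides
UTR5_NT = 41           # 5' non-coding region, nucleotides
UTR3_NT = 106          # 3' non-coding region, nucleotides
PREPROTEIN_AA = 2767   # preprotein monomer, amino acids
LEADER_AA = 19         # leader sequence, amino acids
MATURE_AA = 2748       # mature monomer, amino acids
MATURE_MR = 302773     # relative molecular mass of the mature monomer


def summarize():
    """Return consistency checks derived from the quoted figures."""
    codons, remainder = divmod(CODING_NT, 3)
    return {
        # Three nucleotides per codon, one codon per amino acid.
        "codons": codons,
        "in_frame": remainder == 0,
        "codons_match_preprotein": codons == PREPROTEIN_AA,
        # Leader plus mature monomer should give the preprotein length.
        "leader_plus_mature": LEADER_AA + MATURE_AA,
        # Sum of the three stated segments (derived, not quoted).
        "summed_segments_nt": UTR5_NT + CODING_NT + UTR3_NT,
        # Average mass per residue of the mature monomer.
        "mean_residue_mass": MATURE_MR / MATURE_AA,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under these figures the coding segment divides evenly into 2767 codons, and the leader and mature monomer lengths add up to the preprotein length. The mean residue mass comes out at roughly 110, which is in the range usually seen for proteins.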