The tandem repeat (D4Z4) associated with facioscapulohumeral muscular dystrophy (FSHD) has been sequenced: each copy of the 3.3 kb repeat contains two homeoboxes and two previously described repetitive sequences, LSau and a GC-rich low copy repeat designated hhspm3. By Southern blotting, FISH, and isolation of cDNA and genomic clones, we show that repeat sequences similar to D4Z4 occur at other locations in the human genome. Southern blot analysis of primate genomic DNA indicates that the copy number of D4Z4-like repeats has increased markedly within the last 25 million years. Two cDNA clones were isolated and found to contain stop codons and frameshifts within the homeodomains. An STS was developed from the cDNAs, and analysis of a somatic cell hybrid panel suggests that they map to chromosome 14. No cDNA clones mapping to the chromosome 4q35 D4Z4 repeats have been identified, although the possibility that these repeats encode a protein cannot be ruled out. Even if D4Z4 does not encode a protein, deletions within this locus are associated with FSHD. The D4Z4 repeats contain LSau repeats and are adjacent to 68 bp Sau3A repeats. Both of these sequences are associated with heterochromatic regions of DNA, which are known to be involved in position effect variegation. We postulate that deletion of D4Z4 sequences could produce a position effect.