Facioscapulohumeral muscular dystrophy (FSHD) is linked to the polymorphic D4Z4 locus on chromosome 4q35. In unaffected individuals, this locus comprises 10-100 tandem copies of members of the 3.3 kb dispersed repeat family. Deletions leaving 1-8 such repeats have been associated with FSHD, for which no candidate gene has been identified. We have determined the complete nucleotide sequence of a 13.5 kb EcoRI genomic fragment comprising the only two 3.3 kb elements left in the affected D4Z4 locus of a patient with FSHD. Sequence analysis showed that the two 3.3 kb repeats are identical. They contain a previously undetected putative promoter, with a TACAA box in place of a TATAA box, and a GC box. In transient expression assays, a luciferase reporter gene fused to 191 bp of this promoter showed strong activity in transfected human rhabdomyosarcoma TE671 cells, and this activity was affected by mutations in the TACAA or GC box. In addition, these 3.3 kb repeats include an open reading frame (ORF) starting 149 bp downstream from the TACAA box and encoding a 391-residue protein with two homeodomains (DUX4). In vitro transcription/translation of the ORF in a rabbit reticulocyte lysate yielded two [35S]Cys/[35S]Met-labeled products with apparent molecular masses of 38 and 75 kDa on SDS-PAGE, corresponding to the DUX4 monomer and dimer, respectively. In conclusion, we propose that each of the 3.3 kb elements in the partially deleted D4Z4 locus could include a DUX4 gene encoding a double homeodomain protein.