We have compared the genomes of 49 bacteriophages related to T4. PCR analysis of six chromosomal regions reveals two types of local sequence variation. In four loci, we found only two alternative configurations in all the genomes that could be analyzed. In contrast, two highly polymorphic loci exhibit variations in the number, the order and the identity of the sequences present. In phage T4, both highly polymorphic loci encode internal proteins (IPs) that are encapsidated in the phage particle and injected with the viral DNA. Among the various T4-related phages, 10 different ORFs have been identified in the IP loci; their amino acid sequences have the characteristics of internal proteins. Each of these coding sequences begins with a highly conserved 11-amino-acid leader motif. In addition, both 5' and 3' to most of these ORFs, there is an approximately 70 bp sequence that contains a T4 early promoter sequence with an overlapping inversely repeated sequence. The homologies within these flanking sequences may mediate the recombinational shuffling of the IP sequences within the locus. We propose that the new IP-like sequences play a role in determining phage host range, since such a role has previously been demonstrated for the IP1 gene of T4.