The genetic organization and sequence heterogeneity of the iceA locus of Helicobacter pylori were studied, and the existence of two distinct gene families, iceA1 and iceA2, at this locus was confirmed. iceA1 has significant sequence homology to nlaIIIR, encoding an endonuclease in Neisseria lactamica, but the similarity at the protein level is limited, due to frameshift mutations of iceA1 in most H. pylori strains. A full-length open reading frame (ORF), capable of encoding a 228aa protein with 52% homology to NlaIII, was observed in only five of the 19 iceA1 strains studied. The region upstream of iceA2 is highly variable in length, containing up to 15 copies of 8bp tandem repeats. iceA2 can encode proteins of 24, 59, 94, or 129 amino acids, consisting of 14 and 10aa domains, conserved in all iceA2 strains, flanking 0, 1, 2, or 3 copies of a 35aa cassette. This 35aa cassette consists of successive domains of 13, 16 and 6aa. The 13aa and 6aa domains are highly conserved, but the 16aa domain exists in two variants. In total, five distinct iceA2 subtypes were defined. Database searches did not reveal any sequences homologous to iceA2. Recombinant IceA1 and IceA2 proteins were expressed in Escherichia coli, confirming the predicted ORFs. Genotype-specific PCR primers permitted iceA genotyping in 318 (99.1%) of a worldwide collection of 321 H. pylori strains. The conserved sizes of the amplification products confirmed the worldwide distribution of discrete variants of iceA1 and iceA2.
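
The modular arithmetic behind the reported iceA2 protein sizes can be checked with a short sketch. It assumes only the domain lengths stated above (14aa and 10aa flanking domains, and a 35aa cassette made up of 13, 16 and 6aa domains); the names `icea2_length`, `FLANK_5`, `FLANK_3` and `CASSETTE_DOMAINS` are hypothetical and introduced here for illustration, not taken from the study.

```python
# Domain sizes in amino acids, as stated in the abstract.
FLANK_5 = 14                   # conserved domain preceding the cassette copies
FLANK_3 = 10                   # conserved domain following the cassette copies
CASSETTE_DOMAINS = (13, 16, 6)  # sub-domains of one cassette (35aa in total)


def icea2_length(cassette_copies: int) -> int:
    """Predicted IceA2 length (aa) for a given number of cassette copies."""
    if cassette_copies < 0:
        raise ValueError("cassette_copies must be non-negative")
    return FLANK_5 + cassette_copies * sum(CASSETTE_DOMAINS) + FLANK_3


# 0 to 3 cassette copies reproduce the four reported protein sizes.
lengths = [icea2_length(n) for n in range(4)]
print(lengths)  # [24, 59, 94, 129]

# Genotyping success rate: 318 typed strains out of 321.
typed, total = 318, 321
print(f"{100 * typed / total:.1f}%")  # 99.1%
```

Each additional cassette copy adds 35aa to the 24aa minimal protein, which accounts for the regular spacing of the four reported sizes.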