The complete sequence of the 210-kb Shigella flexneri 5a virulence plasmid was determined. Shigella spp. cause dysentery and diarrhea by invading and spreading through the colonic mucosa. Most of the known Shigella virulence determinants are encoded on a large plasmid that is unique to virulent strains of Shigella and enteroinvasive Escherichia coli; these known genes account for approximately 30 to 35% of the virulence plasmid. In the complete sequence of the virulence plasmid, 286 open reading frames (ORFs) were identified. An astonishing 153 (53%) of these were related to known or putative insertion sequence (IS) elements; no bacterial plasmid described previously has such a high proportion of IS elements. Four new IS elements were identified. Fifty putative proteins show no significant homology to proteins of known function; of these, 18 are encoded by ORFs with a G+C content of less than 40%, which is typical of known virulence genes on the plasmid. These 18 ORFs are candidates for previously unidentified virulence genes. Two alleles of shet2 and five alleles of ipaH were also identified on the plasmid. Thus, the plasmid sequence suggests a remarkable history of IS-mediated acquisition of DNA across bacterial species. The complete sequence will permit targeted characterization of potential new Shigella virulence determinants.
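
The G+C screen mentioned above (ORFs below 40% G+C flagged as candidate virulence genes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `gc_content` and `low_gc_orfs` and the toy sequences are hypothetical, and only the 40% cutoff comes from the text.

```python
from __future__ import annotations


def gc_content(seq: str) -> float:
    """Return the G+C fraction of a nucleotide sequence (0.0 to 1.0)."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the ratio.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return (seq.count("G") + seq.count("C")) / counted


def low_gc_orfs(orfs: dict[str, str], threshold: float = 0.40) -> list[str]:
    """Return the names of ORFs whose G+C fraction is below the threshold."""
    return [name for name, seq in orfs.items() if gc_content(seq) < threshold]


# Hypothetical toy sequences for illustration only; not taken from the plasmid.
example_orfs = {
    "orf_a": "ATGAAATTTAATGCATTATAA",  # AT-rich
    "orf_b": "ATGGCCGGCGCGCTGGCCTGA",  # GC-rich
}

if __name__ == "__main__":
    for name in low_gc_orfs(example_orfs):
        print(name, round(gc_content(example_orfs[name]), 3))
```

In a real analysis, the ORF sequences would come from the annotated plasmid, and the screen would be applied only to the ORFs without significant homology to proteins of known function.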