Infect Immun. 1998 Aug;66(8):3810-7.

Molecular Evolution of a Pathogenicity Island From Enterohemorrhagic Escherichia Coli O157:H7

Free PMC article

N T Perna et al. Infect Immun.

Abstract

We report the complete 43,359-bp sequence of the locus of enterocyte effacement (LEE) from EDL933, an enterohemorrhagic Escherichia coli O157:H7 serovar originally isolated from contaminated hamburger implicated in an outbreak of hemorrhagic colitis. The locus was isolated from the EDL933 chromosome with a homologous-recombination-driven targeting vector. Recent completion of the LEE sequence from enteropathogenic E. coli (EPEC) E2348/69 afforded the opportunity for a comparative analysis of the entire pathogenicity island. We have identified a total of 54 open reading frames in the EDL933 LEE. Of these, 13 fall within a putative P4 family prophage designated 933L. The prophage is not present in E2348/69 but is found in a closely related EPEC O55:H7 serovar and other O157:H7 isolates. The remaining 41 genes are shared by the two complete LEEs, and we describe the nature and extent of variation among the two strains for each gene. The rate of divergence is heterogeneous along the locus. Most genes show greater than 95% identity between the two strains, but other genes vary more than expected for clonal divergence among E. coli strains. Several of these highly divergent genes encode proteins that are known to be involved in interactions with the host cell. This pattern suggests recombinational divergence coupled with natural selection and has implications for our understanding of the interaction of both pathogens with their host, for the emergence of O157:H7, and for the evolutionary history of pathogens in general.

Figures

FIG. 1
Diagram of the EDL933 LEE. (A) ORFs are shown above and below the line to indicate the direction of transcription. Genes of the putative prophage are shown in black. Genes common to both the EDL933 and E2348/69 LEEs are shown in white. Genes homologous to those on the K-12 chromosome are hatched. The number of synonymous changes per synonymous site (dS) (B) and the number of nonsynonymous changes per nonsynonymous site (dN) (C) are shown for each ORF shared with the EPEC E2348/69 LEE (values are taken from Table 2). A scale (in base pairs) is shown along the bottom.
FIG. 2
Alignment of the two repeats flanking 933L, the selC end of the E2348/69 LEE, and the end of PAI-1 from uropathogenic E. coli (GenBank accession no. M13943). Residues that match the first copy of the repeat in EDL933 are represented by a dot. Coordinates shown to the left of the alignment correspond to the sequence position in each GenBank entry.
FIG. 3
PCR to detect 933L in genomic DNA from MG1655, EDL933, 5624-50, and E2348/69 with primers leephage-f and leephage-r from Table 1. One-half microliter of the reaction mixture was loaded on a 1% horizontal agarose gel in Tris-acetate-EDTA buffer and electrophoresed at 40 V for 1.5 h to visualize the product. The sizes of bands in Lambda DNA/HindIII markers (λHindIII) from Promega and a Low DNA MASS ladder (LDML) from GIBCO BRL in kilobase pairs are shown.
FIG. 4
Alignment of the repetitive motifs of EDL933 gene L0016 and the E2348/69 homolog. Coordinates shown to the left of the alignment correspond to the sequence position in each GenBank entry. Each sequence is labeled with the repeat number and subscripted strain designation (R1EDL933, R2EDL933, etc.). The repeat is in frame in both strains, and the putative peptide sequence is shown above the alignment with asterisks replacing residues that are not absolutely conserved. Dots indicate nucleotides identical to those in the R1EDL933 sequence. Dashes have been added to maximize the similarity among sequences.
FIG. 5
Evolutionary relationships among strains used in this study and 20 E. coli reference strains based on mdh sequences. A phylogeny was reconstructed with published data from the ECOR collection (6, 38) (GenBank accession no. U04742 through U04758, U04770, and AF004201) and our EDL933, E2348/69, and DEC strain mdh sequences. The tree is based on a MegAlign (DNASTAR) multiple sequence alignment of positions 34 through 878 of the 916-bp mdh coding region, excluding positions 652 and 653. The phylogeny was reconstructed with a Kimura two-parameter distance matrix and the neighbor-joining algorithm. The percentages of 2,000 bootstrap replicates supporting each cluster are shown along the branches.
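
The Kimura two-parameter distance named in the Fig. 5 legend can be illustrated with a minimal sketch. It uses the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion. The function name k2p_distance and the toy sequences are assumptions for illustration only and are not taken from the paper. Skipping gapped or ambiguous sites is also an assumption, because the handling of such sites is not described here.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions;
# every other mismatch between A, C, G, T is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance for two pre-aligned DNA sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped (an assumption, not the paper's stated method).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


if __name__ == "__main__":
    # Toy sequences: one transition (A->G) across ten aligned sites.
    print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

A matrix of such pairwise distances is the input that a neighbor-joining implementation would take. The tree-building step is not sketched here.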
