Background: Although Escherichia coli is one of the most intensely studied model organisms, many questions remain about its evolutionary biology and ecology. An important step toward a more complete understanding of E. coli biology is to elucidate the relationships between gene content and adaptation to the ecological niche.
Results: Here, we present genome comparisons of 16 E. coli strains that represent commensals and pathogens isolated from infants during a specific time period in Trondheim, Norway. Using differential gene content, we characterized enrichment profiles across the strain collection with respect to phylogeny, early vs. late colonization, pathogenicity and growth rate. We found clear gene content distinctions relating to these grouping criteria. We also found that different categories of strains use different genetic elements for similar biological processes. The sequenced genomes included two pairs of strains in which each pair was isolated from the same infant at different time points. One pair, in which the strains were isolated four months apart, showed maintenance of an early colonizer genome profile but also changes in gene content and codon usage toward the late colonizer profile. Lastly, we placed our sequenced isolates into a broader genomic context by comparing them with 25 published E. coli genomes that represent a variety of pathotypes and commensal strains. This analysis demonstrated the importance of geography in shaping strain-level gene content profiles.
Conclusions: Our results indicate a general pattern in which alternative genetic pathways lead toward a consistent ecological role for E. coli as a species. Within this framework, however, we saw selection shaping the coding repertoire of E. coli strains toward distinct ecotypes with different phenotypic properties.