J Bacteriol. 2008 Feb;190(4):1401-12.
doi: 10.1128/JB.01415-07. Epub 2007 Dec 7.

Diversity, Activity, and Evolution of CRISPR Loci in Streptococcus thermophilus

Free PMC article

Philippe Horvath et al. J Bacteriol.

Abstract

Clustered regularly interspaced short palindromic repeats (CRISPR) are hypervariable loci widely distributed in prokaryotes that provide acquired immunity against foreign genetic elements. Here, we characterize a novel Streptococcus thermophilus locus, CRISPR3, and experimentally demonstrate its ability to integrate novel spacers in response to bacteriophage. Also, we analyze CRISPR diversity and activity across three distinct CRISPR loci in several S. thermophilus strains. We show that both CRISPR repeats and cas genes are locus specific and functionally coupled. A total of 124 strains were studied, and 109 unique spacer arrangements were observed across the three CRISPR loci. Overall, 3,626 spacers were analyzed, including 2,829 for CRISPR1 (782 unique), 173 for CRISPR2 (16 unique), and 624 for CRISPR3 (154 unique). Sequence analysis of the spacers revealed homology and identity to phage sequences (77%), plasmid sequences (16%), and S. thermophilus chromosomal sequences (7%). Polymorphisms were observed for the CRISPR repeats, CRISPR spacers, cas genes, CRISPR motif, locus architecture, and specific sequence content. Interestingly, CRISPR loci evolved both via polarized addition of novel spacers after exposure to foreign genetic elements and via internal deletion of spacers. We hypothesize that the level of diversity is correlated with relative CRISPR activity and propose that the activity is highest for CRISPR1, followed by CRISPR3, while CRISPR2 may be degenerate. Globally, the dynamic nature of CRISPR loci might prove valuable for typing and comparative analyses of strains and microbial populations. Also, CRISPRs provide critical insights into the relationships between prokaryotes and their environments, notably the coevolution of host and viral genomes.
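
The spacer counts reported above can be checked with a short tally. The sketch below is illustrative only: it uses the totals and unique counts given in the abstract, and the "unique fraction" (unique spacers divided by total spacers) is an assumed, simplified proxy for diversity, not the measure used by the authors. The names `spacers`, `unique_fraction`, and `ranking` are hypothetical.

```python
# Spacer counts per locus, taken from the abstract.
spacers = {
    "CRISPR1": {"total": 2829, "unique": 782},
    "CRISPR2": {"total": 173, "unique": 16},
    "CRISPR3": {"total": 624, "unique": 154},
}


def unique_fraction(counts):
    """Share of spacers that are unique (assumed proxy for diversity)."""
    return counts["unique"] / counts["total"]


# The per-locus totals should add up to the overall number of spacers analyzed.
total_spacers = sum(c["total"] for c in spacers.values())

# Rank loci from highest to lowest unique fraction.
ranking = sorted(
    spacers,
    key=lambda locus: unique_fraction(spacers[locus]),
    reverse=True,
)

print("total spacers:", total_spacers)
for locus in ranking:
    print(f"{locus}: {unique_fraction(spacers[locus]):.1%} unique")
```

Under this proxy the per-locus totals sum to the 3,626 spacers reported, and the ordering of the loci matches the proposed activity ranking (CRISPR1, then CRISPR3, then CRISPR2).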

Figures

FIG. 1.
S. thermophilus CRISPR3 locus overview. (A) CRISPR3 locus in the LMD-9 genome; (B) CRISPR3 locus in the genome of CNRZ1066, also present in the LMG 18311 genome; (C) CRISPR3 locus in strain DGCC7984, without cas genes; (D) CRISPR3 locus in strain DGCC7857, without cas genes and a repeat-spacer region.
FIG. 2.
Graphic representation of spacers across the three CRISPR loci for a variety of S. thermophilus strains. Repeats are not included; only spacers are represented. Each spacer is represented by a combination of one select character in a particular font color, on a particular background color. The color combination allows unique representation of a particular spacer, whereby squares with similar color schemes (combination of character color and background color) represent identical spacers, whereas different color combinations represent distinguishable spacers. Deleted spacers are represented by crossed squares. L1, L2, and L3, CRISPR leader sequences. Left, CRISPR1; center, CRISPR2; right, CRISPR3. Question marks and empty spaces indicate elements that were not sequenced.
FIG. 3.
Putative secondary structures of the three S. thermophilus CRISPR repeats. Putative structures were predicted by using the Mfold program (26). (A) Putative structures of paired CRISPR repeats for the three loci; (B) putative structure obtained by pairing of six consecutive CRISPR3 repeats.
FIG. 4.
Overview of the three S. thermophilus CRISPR loci in the LMD-9 genome. The cas genes are shown in black. Numbers within the genes indicate the genomic ORF number. Numbers on the gray shading indicate percent identity (top) and percent similarity (bottom) between homologous Cas protein sequences. Other Cas protein sequences do not share significant similarity.
FIG. 5.
Coclustering of CRISPR repeats and Cas sequences. CRISPR repeats and Cas sequences were aligned by using CLUSTAL X (15). (A) Analysis of various CRISPR repeats; (B) analysis of concatenated Cas sequences; (C) analysis of Cas1 sequences. Several sequences were retrieved from the CRISPRdb database (9) or found by using CRISPRFinder (10). Lin, Listeria innocua Clip11262 (AL592022); Mbo, Mycobacterium bovis BCG Pasteur 1173P2 (AM408590); Mtu, Mycobacterium tuberculosis F11 (CP000717); Sag, Streptococcus agalactiae A909 (CP000114); Sep, Staphylococcus epidermidis RP62A (CP000029); Smu, Streptococcus mutans UA159 (AE014133); Spy, Streptococcus pyogenes MGAS5005 (CP000017); Ssa, Streptococcus sanguinis SK36 (CP000387); Streptococcus suis 89/1591 (AAFA00000000); Sth, Streptococcus thermophilus LMD-9 (CP000419). CR1, CRISPR1; CR2, CRISPR2; CR3, CRISPR3; CRa and CRb, two CRISPR loci in S. pyogenes.
FIG. 6.
CRISPR spacer size variability. The x axis represents the size of a CRISPR spacer, in nucleotides. The y axis represents the number of CRISPR spacer sequences of a given size. (A) CRISPR1 spacers; (B) CRISPR2 spacers; (C) CRISPR3 spacers.
FIG. 7.
CRISPR motifs identified in the vicinity of the CRISPR proto-spacers. (A) Motif identified in the vicinity of CRISPR1 proto-spacers in the genome of the phage used in the challenge; (B) motif identified in the vicinity of CRISPR3 proto-spacers in the genome of the phage used in the challenge. Conserved sequence motifs were visualized by using WebLogo (6).
