Nucleic Acids Res. 2020 Jan 8;48(D1):D535-D544.
doi: 10.1093/nar/gkz915.

CRISPRCasdb a Successor of CRISPRdb Containing CRISPR Arrays and Cas Genes From Complete Genome Sequences, and Tools to Download and Query Lists of Repeats and Spacers

Christine Pourcel et al. Nucleic Acids Res. 2020.
Free PMC article

Abstract

In Archaea and Bacteria, the arrays called CRISPRs, for 'clustered regularly interspaced short palindromic repeats', and the CRISPR-associated genes, or cas, provide adaptive immunity against viruses, plasmids and transposable elements. Short sequences called spacers, corresponding to fragments of invading DNA, are stored between repeated sequences. CRISPR-Cas systems target sequences homologous to spacers, leading to their degradation. To facilitate investigations of CRISPRs, we developed a website hosting CRISPRdb 12 years ago. We now propose CRISPRCasdb, a completely new version giving access to both CRISPRs and cas genes. We used CRISPRCasFinder, a program that identifies CRISPR arrays and cas genes and determines the system's type and subtype, to process public whole-genome assemblies. Strains are displayed either in an alphabetical list or in taxonomic order. The database is part of the CRISPR-Cas++ website, which also offers the possibility to analyse submitted sequences and to download programs. A BLAST search against lists of repeats and spacers extracted from the database is also available. To date, 16 990 complete prokaryote genomes (16 650 bacteria from 2973 species and 340 archaea from 300 species) are included. CRISPR-Cas systems were found in 36% of Bacteria and 75% of Archaea strains. CRISPRCasdb is freely accessible at https://crisprcas.i2bc.paris-saclay.fr/.
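
The genome counts and percentages quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes the rounded percentages apply directly to the genome counts, so the derived numbers are rough estimates and not figures reported by the authors. The function and variable names are hypothetical.

```python
# Figures quoted in the abstract.
BACTERIA_GENOMES = 16_650
ARCHAEA_GENOMES = 340
TOTAL_GENOMES = 16_990

# Rounded percentages from the abstract. Assumption: they apply to the
# genome counts above, so everything derived from them is approximate.
BACTERIA_WITH_CRISPR_CAS = 0.36
ARCHAEA_WITH_CRISPR_CAS = 0.75


def estimate_positive(genomes: int, fraction: float) -> int:
    """Approximate number of genomes carrying a CRISPR-Cas system."""
    return round(genomes * fraction)


# The two domain counts should add up to the stated total.
counts_consistent = BACTERIA_GENOMES + ARCHAEA_GENOMES == TOTAL_GENOMES

bacteria_hits = estimate_positive(BACTERIA_GENOMES, BACTERIA_WITH_CRISPR_CAS)
archaea_hits = estimate_positive(ARCHAEA_GENOMES, ARCHAEA_WITH_CRISPR_CAS)

# Pooled fraction across both domains, dominated by the bacterial genomes.
overall_fraction = (bacteria_hits + archaea_hits) / TOTAL_GENOMES

print(f"counts consistent: {counts_consistent}")
print(f"estimated bacterial genomes with CRISPR-Cas: {bacteria_hits}")
print(f"estimated archaeal genomes with CRISPR-Cas: {archaea_hits}")
print(f"estimated overall fraction: {overall_fraction:.1%}")
```

Because bacterial genomes make up almost the whole collection, the pooled fraction stays close to the bacterial one even though the archaeal rate is much higher.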

Figures

Figure 1.
Workflow for the development of CRISPRCasdb. (A) Workflow for the recovery of genome sequences and associated data, CRISPRCasFinder calculation, storage and display of data. (B) Implementation of CRISPRCasdb-BLAST. Sequences provided in the output of CRISPRCasdb, CRISPRCasFinder, CRISPRCasMeta or directly submitted by users can be blasted against lists of repeats and spacers from the database.
Figure 2.
Screenshots of the browse page and output in CRISPRCasdb. (1) Selecting a strain leads to a schematic representation of the genome with the position of CRISPR arrays and cas clusters present in the genome(s) and a table with their position, sequence of the consensus repeats and name of cas genes. (2) The CRISPR array is depicted with repeats coloured in yellow and spacers with different colours. (3) Selected sequences (repeat or spacer) can be blasted against lists present in the database.
Figure 3.
Evaluation of the total number of spacers per strain. (A) Total number of spacers present in 6769 bacterial genomes and 282 archaeal genomes. (B) Percentage of genomes as a function of the number of spacers.
Figure 4.
Repeat size distribution. (A) x is the repeat size in bp and y is the number of CRISPR arrays. (B) x is the repeat size in bp and y is the percentage of CRISPR arrays.
Figure 5.
Relative size of spacers and repeats. x is the total size of ‘repeat + spacer’ and y is the percentage of occurrences of each size.
Figure 6.
Number of Cas types and subtypes. The different CRISPR–Cas subtypes are shown on the x axis and the percentage of genomes is shown on the y axis.
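
The size distributions described for Figures 4 and 5 are tallies over CRISPR arrays. The following minimal sketch shows one way such percentages could be computed. It is not the authors' code: the input layout (a consensus repeat plus a list of spacers per array), the function names and the placeholder sequences are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy data: each array is (consensus repeat, list of spacers).
# The sequences are placeholders, not real CRISPR repeats or spacers.
arrays = [
    ("A" * 29, ["C" * 32, "G" * 32, "T" * 33]),
    ("A" * 29, ["C" * 32]),
    ("A" * 36, ["G" * 30, "T" * 30]),
    ("A" * 32, ["C" * 35, "G" * 34]),
]


def to_percentages(counts: Counter) -> dict[int, float]:
    """Convert raw counts per size into percentages of the total."""
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in sorted(counts.items())}


def repeat_size_percentages(arrays) -> dict[int, float]:
    """Percentage of arrays per repeat size in bp (in the spirit of Figure 4B)."""
    return to_percentages(Counter(len(repeat) for repeat, _ in arrays))


def unit_size_percentages(arrays) -> dict[int, float]:
    """Percentage of 'repeat + spacer' units per total size (as in Figure 5).

    Assumption: every spacer contributes one unit, paired with the
    consensus repeat of its array.
    """
    sizes = Counter(
        len(repeat) + len(spacer)
        for repeat, spacers in arrays
        for spacer in spacers
    )
    return to_percentages(sizes)


repeat_pct = repeat_size_percentages(arrays)
unit_pct = unit_size_percentages(arrays)
print(repeat_pct)
print(unit_pct)
```

Counting per array for repeat sizes and per spacer for unit sizes follows the wording of the captions; a different weighting (for example per genome) would give different percentages.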
