Nucleic Acids Res. 2020 Sep 18;48(16):8828-8847.
doi: 10.1093/nar/gkaa635.

Evolutionary and functional classification of the CARF domain superfamily, key sensors in prokaryotic antivirus defense

Kira S Makarova et al. Nucleic Acids Res.

Abstract

CRISPR-associated Rossmann Fold (CARF) and SMODS-associated and fused to various effector domains (SAVED) are key components of cyclic oligonucleotide-based antiphage signaling systems (CBASS) that sense cyclic oligonucleotides and transmit the signal to an effector inducing cell dormancy or death. Most of the CARFs are components of a CBASS built into type III CRISPR-Cas systems, where the CARF domain binds cyclic oligoA (cOA) synthesized by Cas10 polymerase-cyclase and allosterically activates the effector, typically a promiscuous ribonuclease. Additionally, this signaling pathway includes a ring nuclease, often also a CARF domain (either the sensor itself or a specialized enzyme) that cleaves cOA and mitigates dormancy or death induction. We present a comprehensive census of CARF and SAVED domains in bacteria and archaea, and their sequence- and structure-based classification. There are 10 major families of CARF domains and multiple smaller groups that differ in structural features, association with distinct effectors, and presence or absence of the ring nuclease activity. By comparative genome analysis, we predict specific functions of CARF and SAVED domains and partition the CARF domains into those with both sensor and ring nuclease functions, and sensor-only ones. Several families of ring nucleases functionally associated with sensor-only CARF domains are also predicted.

Figures

Figure 1.
Relationships between CARF and SAVED domain-containing protein sequences. (A) Dendrogram built from the alignment of CARF and SAVED domain sequences only. The dendrogram was built using the ‘hybrid’ approach for sequence classification. Briefly, the FastTree program was used to infer relationships within alignable clusters, and the relationships between these clusters were inferred from HHalign pairwise scores using the matrix-based UPGMA method as described in detail previously (1). Distinct major alignable clusters are color coded. (B) Dendrogram built using alignment of complete amino acid sequences of CARF domain-containing proteins. Major and minor CARF clades corresponding to well-supported branches that include five or more sequences from diverse genomes are shown schematically on the right. The color coding is the same as in panel A. CARF_m13 group sequences are highly divergent and are included only in the second dendrogram. (C) Dendrogram built from the alignment of complete amino acid sequences of SAVED domain-containing proteins. Seven SAVED clades corresponding to well-supported branches that include 5 or more sequences from diverse genomes are shown schematically on the right. The color coding is the same as in panel A. The dendrograms in panels B and C were built using the same approach as the dendrogram in panel A. The subtrees including the sequences from the major cluster CARF1 (red) and SAVED (blue) were extracted from the tree built using complete protein sequences. Common names used in the literature are indicated in parentheses.
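The between-cluster step described in the Figure 1 caption (UPGMA applied to a matrix of pairwise scores) can be illustrated with a minimal sketch. It assumes the HHalign scores have already been converted to distances; that conversion is not described here, so the function simply takes a distance table as input. The function name `upgma`, the labels and the toy distances are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a UPGMA tree as nested tuples.

    names: list of cluster labels.
    dist:  dict mapping an unordered pair (a, b) of labels to a distance.
           (Assumption: scores were converted to distances beforehand.)
    """
    # Active clusters: id -> (subtree, number of leaves)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances keyed by (smaller id, larger id)
    d = {}
    for i, j in combinations(range(len(names)), 2):
        a, b = names[i], names[j]
        d[(i, j)] = dist[(a, b)] if (a, b) in dist else dist[(b, a)]

    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of active clusters
        i, j = min(d, key=d.get)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)

        # Distance to the merged cluster is the size-weighted average
        new = {}
        for k in clusters:
            d_ik = d[tuple(sorted((i, k)))]
            d_jk = d[tuple(sorted((j, k)))]
            new[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)

        # Drop entries that involve the two merged clusters
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1

    return next(iter(clusters.values()))[0]


# Toy distances between four hypothetical clusters
toy = {
    ("A", "B"): 2, ("C", "D"): 4,
    ("A", "C"): 10, ("A", "D"): 10, ("B", "C"): 10, ("B", "D"): 10,
}
tree = upgma(["A", "B", "C", "D"], toy)
```

In the hybrid approach described above, a tree of this kind would only connect the alignable clusters; relationships within each cluster come from FastTree.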
Figure 2.
Association of different CARF and SAVED domain clades with CRISPR–Cas systems. The relative frequencies of CRISPR–Cas systems associated with distinct CARF and SAVED clades are shown.
Figure 3.
Structures of the CARF domain-containing proteins. (A) Schematic representation of the conserved core of the CARF fold. Motifs I and II (corresponding to β1-α1 and β4-α4 junctions, respectively) are involved in cyclic oligoadenylate binding/cleavage activities. (B) Superimposed structures of 11 CARF domains colored according to chain progression from N-terminus (blue) to C-terminus (red). Non-conserved loops/insertions were removed for clarity. (C) Selected structures of CARF proteins with different domain architectures. Domain homodimers (different chains) or heterodimers (single chain as in TtCan1) are represented by different shades of the same color. All CARF domain dimers have a cleft (blue mesh) in the corresponding structural regions. This cleft (pocket) is where a cyclic oligoadenylate (shown in pink and labeled) binds. CARF domains, orange; domains in topologically equivalent positions following CARF (Csx1 connector domain, Csm6 6H domain and wHTH domain), light blue; toxin domains (HEPN, wHTH-HEPN and PD-D/ExK), green.
Figure 4.
Sequence motifs of ring nucleases. Conserved motifs in different groups of CARF ring nucleases are represented as sequence logos. For each CARF group, the number of sequences in the group and known representatives are indicated in parentheses. Motifs I and II are framed. Positions of residues in motif-I and motif-II known to be important for binding/catalysis of cOA are indicated by green and blue stars, respectively. An additional conserved Glu (motif-IA) in the CARF7 and CARF_m4 groups, as well as a structurally analogous conserved Asp (motif-I) in CARF1b, both predicted to be important for ring nuclease activity, are indicated with a red star. CARF9 is represented by close homologs of the ToCsx1 protein, in which Trp14 was identified as a catalytic residue (enclosed in a red frame).
Figure 5.
Analysis of the components of the cOA signaling pathway. (A) Representation of different CARF groups in type III CRISPR–Cas loci. Vertical axis, number of type III loci that encode the respective CARF. Orange bars show the loci with a single CARF domain-containing gene, with the additional requirement that no previously reported type III-associated non-CARF genes (1) or CARF proteins from other groups are present in the respective genome (solo-CARF loci). Blue bars show the remaining loci. (B) Co-occurrence of different families of CARFs and type III-associated proteins in genomes that encode only one such protein in addition to CARF2, CARF4 or CARF9 (‘double’ CARF loci). The vertical axis is the number of ‘double’ CARF loci. (C) Co-occurrence of different families of CARFs and type III-associated genes with the three major groups of CARFs predicted to lack ring nuclease activity (CARF2, CARF4 and CARF9). The vertical axis is the number of CARF loci. (D) Domain organizations of selected proteins containing known or predicted ring nuclease domains. The domains are shown approximately to scale. The domain family name is indicated above each bar.
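The solo-CARF criterion in panel A can be written as a small predicate. The sketch below is only an illustration of the rule as worded in the caption: the `Locus` record, its field names, the `tally` helper and the example entries are all hypothetical, and the toy loci are not data from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Locus:
    # CARF group of each CARF domain-containing gene in the type III locus
    carf_groups: list
    # All CARF groups encoded anywhere in the same genome
    genome_carf_groups: set
    # Previously reported type III-associated non-CARF genes in the genome
    genome_ancillary: set = field(default_factory=set)


def is_solo_carf(locus):
    """True if the locus meets the solo-CARF definition from the caption."""
    if len(locus.carf_groups) != 1:
        return False  # more than one (or no) CARF gene in the locus
    group = locus.carf_groups[0]
    no_ancillary = not locus.genome_ancillary
    no_other_carf = locus.genome_carf_groups <= {group}
    return no_ancillary and no_other_carf


def tally(loci):
    """Count loci per (CARF group, 'solo' or 'other'), as in a stacked bar plot."""
    counts = Counter()
    for locus in loci:
        label = "solo" if is_solo_carf(locus) else "other"
        for group in set(locus.carf_groups):
            counts[(group, label)] += 1
    return counts


# Toy input; "hypothetical_anc" stands in for any ancillary gene name
loci = [
    Locus(["CARF7"], {"CARF7"}),
    Locus(["CARF4"], {"CARF4", "CARF7"}),
    Locus(["CARF4"], {"CARF4"}, {"hypothetical_anc"}),
    Locus(["CARF2", "CARF7"], {"CARF2", "CARF7"}),
]
counts = tally(loci)
```

Here the "solo" counts correspond to the orange bars and the "other" counts to the blue bars for each CARF group.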
Figure 6.
Functional organization of selected CARF domain-encoding loci. For each locus, the species name and genome accession number are indicated. Genes are shown by arrows roughly to scale. A dashed line between arrows indicates that the genes are encoded far from each other. Arrows are color-coded according to the scheme below. The gene names largely follow the nomenclature from (1), but the RAMP proteins of groups 5 and 7 and small subunits are denoted gr5, gr7 and gr11, respectively. The CRISPR–Cas system subtype is indicated on the left. Experimentally characterized ring nucleases are denoted by a small red circle above the respective arrows.
Figure 7.
Updated scheme of the cOA signaling pathway. The general scheme of the cOA signaling pathway is shown on top, followed by experimentally characterized and predicted components of the pathway classified into five functional categories within gray shapes. Distinct protein families are shown by oval shapes. CARF families are shown in dark green, and other proteins are shown by oval shapes of different colors, including shades of blue for uncharacterized membrane proteins and shades of gray for protein families without similarity to any characterized proteins. Experimentally characterized ring nucleases are shown by a thick yellow outline, and predicted ring nucleases are shown by a thick dashed yellow outline in panel A. Components are denoted by general family name if known; CRISPR–Cas ancillary protein names are indicated according to the current nomenclature of cas genes (1). Abbreviations: cOA, cyclic oligoadenylates; CARF, CRISPR-associated Rossmann Fold; WYL, predicted ligand-binding domain associated with many CRISPR–Cas systems (named after the respective amino acids that are partly conserved in the family); HEPN, PIN, RelE, ribonucleases of the respective families; HTH, helix-turn-helix DNA-binding domain; HD, PD-D/ExK, nucleases (or phosphatases) of the respective superfamilies; CorA, divalent cation channel or its homolog.

References

    1. Makarova K.S., Wolf Y.I., Iranzo J., Shmakov S.A., Alkhnbashi O.S., Brouns S.J.J., Charpentier E., Cheng D., Haft D.H., Horvath P. et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 2020; 18:67–83.
    2. Mohanraju P., Makarova K.S., Zetsche B., Zhang F., Koonin E.V., van der Oost J. Diverse evolutionary roots and mechanistic variations of the CRISPR–Cas systems. Science. 2016; 353:aad5147.
    3. Hille F., Richter H., Wong S.P., Bratovic M., Ressel S., Charpentier E. The Biology of CRISPR–Cas: backward and Forward. Cell. 2018; 172:1239–1259.
    4. Amitai G., Sorek R. CRISPR–Cas adaptation: insights into the mechanism of action. Nat. Rev. Microbiol. 2016; 14:67–76.
    5. Jackson S.A., McKenzie R.E., Fagerlund R.D., Kieper S.N., Fineran P.C., Brouns S.J. CRISPR–Cas: adapting to change. Science. 2017; 356:eaal5056.