Multiple origins of reverse transcriptases linked to CRISPR-Cas systems

RNA Biol. 2019 Oct;16(10):1486-1493. doi: 10.1080/15476286.2019.1639310. Epub 2019 Jul 11.

Abstract

Prokaryotic genomes harbour a plethora of uncharacterized reverse transcriptases (RTs). RTs phylogenetically related to those encoded by group-II introns have been found associated with type III CRISPR-Cas systems, adjacent or fused at the C-terminus to Cas1. It is thought that these RTs may have a relevant function in the CRISPR immune response, mediating spacer acquisition from RNA molecules. The origin and relationships of these RTs and the ways in which the various protein domains evolved remain matters of debate. We carried out a large survey of annotated RTs in databases (198,760 sequences) and constructed a large dataset of unique representative sequences (9,141). The combined phylogenetic reconstruction and identification of the RTs and their various protein domains in the vicinity of CRISPR adaptation and effector modules revealed three different origins for these RTs, consistent with their emergence on multiple occasions: a larger group that has evolved from group-II intron RTs, and two minor lineages that may have arisen more recently from Retron/retron-like sequences and Abi-P2 RTs, the latter associated with type I-C systems. We also identified a particular group of RTs associated with CRISPR-cas loci in clade 12, fused C-terminally to an archaeo-eukaryotic primase (AEP), a protein domain (AE-Prim_S_like) forming a particular family within the AEP proper clade. Together, these data provide new insight into the evolution of CRISPR-Cas/RT systems.

Keywords: Abi; CRISPR; Cas; archaeo-eukaryotic primase; diversity-generating retroelement; group II intron; retron; reverse transcriptase.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • CRISPR-Cas Systems*
  • Chromosome Mapping
  • Genetic Linkage
  • Genetic Variation
  • Introns
  • Phylogeny
  • Prokaryotic Cells / metabolism
  • RNA-Directed DNA Polymerase / genetics*
  • RNA-Directed DNA Polymerase / metabolism

Substances

  • RNA-Directed DNA Polymerase

Grants and funding

This work was supported by the Spanish Ministerio de Ciencia, Innovación y Universidades, including ERDF (European Regional Development Funds) research grants [BIO2014-51953-P, BIO2017-82244-P]. A.G-D was supported by an FPU predoctoral fellowship grant from the Ministerio de Economía y Competitividad [FPU15/02714].