CROTON: an automated and variant-aware deep learning framework for predicting CRISPR/Cas9 editing outcomes

Bioinformatics. 2021 Jul 12;37(Suppl_1):i342-i348. doi: 10.1093/bioinformatics/btab268.

Abstract

Motivation: CRISPR/Cas9 is a revolutionary gene-editing technology that has been widely utilized in biology, biotechnology and medicine. CRISPR/Cas9 editing outcomes depend on local DNA sequences at the target site and are thus predictable. However, existing prediction methods are dependent on both feature and model engineering, which restricts their performance to existing knowledge about CRISPR/Cas9 editing.

Results: Herein, deep multi-task convolutional neural networks (CNNs) and neural architecture search (NAS) were used to automate both feature and model engineering and create an end-to-end deep-learning framework, CROTON (CRISPR Outcomes Through cONvolutional neural networks). The CROTON model architecture was tuned automatically with NAS on a synthetic large-scale construct-based dataset and then tested on an independent primary T cell genomic editing dataset. CROTON outperformed existing expert-designed models and non-NAS CNNs in predicting 1 base pair insertion and deletion probability as well as deletion and frameshift frequency. Interpretation of CROTON revealed local sequence determinants for diverse editing outcomes. Finally, CROTON was utilized to assess how single nucleotide variants (SNVs) affect the genome editing outcomes of four clinically relevant target genes: the viral receptors ACE2 and CCR5 and the immune checkpoint inhibitors CTLA4 and PDCD1. Large SNV-induced differences in CROTON predictions in these target genes suggest that SNVs should be taken into consideration when designing widely applicable gRNAs.

Availability and implementation: https://github.com/vli31/CROTON.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • CRISPR-Cas Systems*
  • Clustered Regularly Interspaced Short Palindromic Repeats
  • Deep Learning*
  • Gene Editing
  • Neural Networks, Computer
  • RNA, Guide

Substances

  • RNA, Guide