Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning

Nat Biotechnol. 2021 Nov;39(11):1414-1425. doi: 10.1038/s41587-021-00938-z. Epub 2021 Jun 28.


Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and trained machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90) using these data. These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants (SNVs) with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE-single-guide RNA pairs enables high-purity transversion base editing at over fourfold more target sites than achieved using any single CGBE variant.
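
The idea of choosing an optimal CGBE-sgRNA pair per target can be illustrated with a minimal sketch. This is not the authors' model or code: the editor names, site names, and the predicted purity and yield values below are hypothetical placeholders, and the 0.9 purity threshold is an assumption loosely modeled on the ">90% precision" figure above. The sketch only shows why picking the best editor per site can cover more sites than any single editor.

```python
from typing import Dict, Optional, Tuple

# Hypothetical predicted (purity, yield) per target site and editor.
# Placeholder values for illustration only; not data from the paper.
SitePreds = Dict[str, Tuple[float, float]]

def pick_best_editor(site_preds: SitePreds,
                     min_purity: float = 0.9) -> Optional[Tuple[str, float, float]]:
    """Among editors meeting the purity threshold, return the one with the
    highest predicted yield as (editor, purity, yield), or None if none qualify."""
    eligible = [(name, p, y) for name, (p, y) in site_preds.items() if p >= min_purity]
    if not eligible:
        return None
    return max(eligible, key=lambda t: t[2])

def coverage(preds: Dict[str, SitePreds], min_purity: float = 0.9):
    """Count sites reaching the purity threshold for each single editor,
    and for the best-editor-per-site strategy."""
    per_editor: Dict[str, int] = {}
    for site_preds in preds.values():
        for name, (p, _) in site_preds.items():
            per_editor[name] = per_editor.get(name, 0) + int(p >= min_purity)
    any_editor = sum(pick_best_editor(s, min_purity) is not None for s in preds.values())
    return per_editor, any_editor

# Hypothetical example input (assumed names and numbers).
example = {
    "site_A": {"editor_1": (0.95, 0.10), "editor_2": (0.60, 0.30)},
    "site_B": {"editor_1": (0.50, 0.20), "editor_2": (0.92, 0.15)},
    "site_C": {"editor_1": (0.40, 0.05), "editor_2": (0.45, 0.25)},
}

if __name__ == "__main__":
    print(coverage(example))
```

In this toy input each editor alone reaches the threshold at one site, while per-site selection reaches it at two, which is the kind of gain the abstract attributes to computational selection.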

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • CRISPR-Cas Systems / genetics
  • Clustered Regularly Interspaced Short Palindromic Repeats*
  • Gene Editing*
  • Machine Learning
  • Mammals / genetics
  • RNA, Guide / genetics

Substances

  • RNA, Guide

Associated data

  • figshare/10.6084/m9.figshare.12275645
  • figshare/10.6084/m9.figshare.12275654