Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning

Abstract

Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and used these data to train machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90). These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants (SNVs) with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE–single-guide RNA pairs enables high-purity transversion base editing at over fourfold more target sites than achieved using any single CGBE variant.
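The idea of choosing an optimal CGBE–single-guide RNA pair per target site can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the BE-Hive API. The `Prediction` class, the `pick_best_pair` function, the editor and site labels, and all numeric values are hypothetical and chosen only for illustration. The sketch assumes that predicted purity and efficiency are already available as fractions, and it applies the >90% purity cutoff mentioned above as a simple filter before ranking by efficiency.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Prediction:
    """Hypothetical container for one model prediction (illustrative only)."""
    editor: str        # made-up CGBE variant label
    sgrna: str         # made-up sgRNA label
    purity: float      # predicted purity of the desired edit, as a fraction
    efficiency: float  # predicted yield of the desired edit, as a fraction


def pick_best_pair(predictions: List[Prediction],
                   min_purity: float = 0.90) -> Optional[Prediction]:
    """Return the most efficient pair whose predicted purity exceeds min_purity.

    Returns None when no candidate passes the purity filter.
    """
    eligible = [p for p in predictions if p.purity > min_purity]
    if not eligible:
        return None
    # Rank by efficiency first; break ties by purity.
    return max(eligible, key=lambda p: (p.efficiency, p.purity))


# Made-up predictions for three hypothetical target sites.
candidates: Dict[str, List[Prediction]] = {
    "site_A": [Prediction("editor_1", "sg1", 0.95, 0.10),
               Prediction("editor_2", "sg1", 0.85, 0.40)],
    "site_B": [Prediction("editor_1", "sg2", 0.70, 0.30),
               Prediction("editor_2", "sg2", 0.93, 0.20)],
    "site_C": [Prediction("editor_1", "sg3", 0.60, 0.50)],
}

best = {site: pick_best_pair(preds) for site, preds in candidates.items()}

for site, choice in best.items():
    label = f"{choice.editor} + {choice.sgrna}" if choice else "no high-purity option"
    print(site, "->", label)
```

In this toy example neither hypothetical editor alone covers both site_A and site_B at high purity, but selecting per site does, which is the intuition behind pairing several editors with outcome prediction.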

Data availability

The target library sequencing data generated during this study are available at the NCBI Sequence Read Archive database under PRJNA631290. Data from the Repair-seq screens are available under PRJNA721212. Processed target library data used for training machine learning models have been deposited under the following DOIs: https://doi.org/10.6084/m9.figshare.12275645 and https://doi.org/10.6084/m9.figshare.12275654.

Code availability

Code used for analysis of CRISPRi screens is available at https://github.com/jeffhussmann/repair-seq. Code used for target library data processing and analysis is available at https://github.com/maxwshen/lib-dataprocessing and https://github.com/maxwshen/lib-analysis, respectively. The machine learning models for CGBEs trained on target library data are available as part of the BE-Hive interactive web application at https://crisprbehive.design and the BE-Hive Python package at https://github.com/maxwshen/be_predict_efficiency and https://github.com/maxwshen/be_predict_bystander.

Acknowledgements

This work was supported by US NIH (nos. U01AI142756, UG3AI150551, RM1HG009490, R35GM118062, R35GM138167 and P30CA072720), HHMI and Princeton University. B.A. acknowledges a Searle Scholars award. The authors acknowledge NSF Graduate Research Fellowships to L.W.K., M.W.S. and T.A.S.; a NWO Rubicon Fellowship to M.A.; a Jane Coffin Childs postdoctoral fellowship to A.V.A.; fellowship support from the NSF and Hertz Foundation to J.L.D.; a Helen Hay Whitney postdoctoral fellowship to G.A.N.; a Damon Runyon Postdoctoral Fellowship to D.Y.; a Singapore A*STAR NSS fellowship to B.M.; and NIH Ruth L. Kirschstein National Research Service Award no. F31NS115380 to J.M.R. J.A.H. was the Rebecca Ridley Kry Fellow of the Damon Runyon Cancer Research Foundation.

Author information

Author notes

  1. Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle & Jonathan S. Weissman

    Present address: Whitehead Institute for Biomedical Research, Cambridge, MA, USA

  2. Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle & Jonathan S. Weissman

    Present address: Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA

  3. These authors contributed equally: Luke W. Koblan, Mandana Arbab, Max W. Shen.

Affiliations

  1. Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of Harvard and MIT, Cambridge, MA, USA

    Luke W. Koblan, Mandana Arbab, Max W. Shen, Andrew V. Anzalone, Jordan L. Doman, Gregory A. Newby, Beverly Mok & David R. Liu

  2. Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA

    Luke W. Koblan, Mandana Arbab, Max W. Shen, Andrew V. Anzalone, Jordan L. Doman, Gregory A. Newby, Beverly Mok, Tyler A. Sisley & David R. Liu

  3. Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA

    Luke W. Koblan, Mandana Arbab, Max W. Shen, Andrew V. Anzalone, Jordan L. Doman, Gregory A. Newby, Beverly Mok & David R. Liu

  4. Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, USA

    Max W. Shen

  5. Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, USA

    Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle, Albert Xu, Jonathan S. Weissman & Britt Adamson

  6. Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA, USA

    Jeffrey A. Hussmann & Albert Xu

  7. Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA, USA

    Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle, Jonathan S. Weissman & Britt Adamson

  8. Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA, USA

    Joseph M. Replogle, Albert Xu & Jonathan S. Weissman

  9. Tetrad Graduate Program, University of California, San Francisco, San Francisco, CA, USA

    Joseph M. Replogle

  10. Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, CA, USA

    Albert Xu

  11. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA

    Britt Adamson

  12. Department of Molecular Biology, Princeton University, Princeton, NJ, USA

    Britt Adamson

Contributions

L.W.K., M.A., M.W.S., J.A.H., A.V.A., J.S.W., B.A. and D.R.L. designed the research. L.W.K., M.A., M.W.S., J.A.H., A.V.A., J.L.D., G.A.N., D.Y., B.M., J.M.R., A.X., T.A.S. and B.A. performed experiments. J.S.W., B.A. and D.R.L. supervised the project. L.W.K. and D.R.L. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Jonathan S. Weissman, Britt Adamson or David R. Liu.

Ethics declarations

Competing interests

J.A.H. is a consultant for Tessera Therapeutics. J.M.R. is a consultant for Maze Therapeutics. J.S.W. is a consultant for, and holds equity in, Maze Therapeutics, Chroma Medicine and KSQ Therapeutics. B.A. was a member of a ThinkLab Advisory Board for, and holds equity in, Celsius Therapeutics. D.R.L. is a consultant for, and holds equity in, Beam Therapeutics, Prime Medicine, Pairwise Plants and Chroma Medicine. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Biotechnology thanks Jia Chen, Leopold Parts and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–15, Discussion 1–6, Sequences and References.

41587_2021_938_MOESM3_ESM.xlsx

Supplementary Table 1. CRISPRi sgRNA library. Supplementary Table 2. Changes in base editing outcomes for all genes in CRISPRi screens. Supplementary Table 3. Base editing outcomes in a library of disease-related alleles correctable by editing C•G to G•C or to A•T. Supplementary Table 4. CGBE targets, amplicons and oligos used for this study.

Supplementary Data 1

All C•G-to-G•C editing yield, purity and indel outcomes for all experiments in this manuscript. T-tests can be generated for any pairwise comparison in this file.

About this article

Cite this article

Koblan, L.W., Arbab, M., Shen, M.W. et al. Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning. Nat Biotechnol (2021). https://doi.org/10.1038/s41587-021-00938-z
