Nanopore sequencers can be used to selectively sequence certain DNA molecules in a pool by reversing the voltage across individual nanopores to reject specific sequences, enabling enrichment and depletion to address biological questions. Previously, we achieved this using dynamic time warping to map the signal to a reference genome, but the method required substantial computational resources and did not scale to gigabase-sized references. Here we overcome this limitation by using graphics processing unit (GPU) base-calling. We show enrichment of specific chromosomes from the human genome and of low-abundance organisms in mixed populations without a priori knowledge of sample composition. Finally, we enrich targeted panels comprising 25,600 exons from 10,000 human genes and 717 genes implicated in cancer, identifying PML–RARA fusions in the NB4 cell line in <15 h of sequencing. These methods can be used to efficiently screen any target panel of genes without specialized sample preparation using any computer and a suitable GPU. Our toolkit, readfish, is available at https://www.github.com/looselab/readfish.
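The accept-or-reject decision described above can be illustrated with a minimal sketch. It assumes that the first chunk of a read has already been base-called and aligned by an external mapper, and it only models the final choice between continuing to sequence and reversing the voltage to eject the molecule. The names `Alignment`, `Target`, `decide` and the mode strings are hypothetical and introduced purely for illustration; they are not the readfish API, and the coordinates are arbitrary.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Alignment:
    """Hypothetical, simplified alignment of one base-called read chunk."""
    contig: str
    start: int
    end: int


@dataclass(frozen=True)
class Target:
    """Hypothetical target region (half-open interval on a contig)."""
    contig: str
    start: int
    end: int


def overlaps(aln: Alignment, target: Target) -> bool:
    """True if the alignment and the target share at least one base."""
    return (
        aln.contig == target.contig
        and aln.start < target.end
        and target.start < aln.end
    )


def decide(alignments: Iterable[Alignment],
           targets: List[Target],
           mode: str = "enrich") -> str:
    """Return 'sequence', 'unblock' or 'wait' for one read chunk.

    'unblock' stands for reversing the voltage to reject the molecule.
    'wait' means no alignment yet, so more signal is needed (an assumption
    of this sketch, not a documented readfish behaviour).
    """
    alignments = list(alignments)
    if not alignments:
        return "wait"
    on_target = any(overlaps(a, t) for a in alignments for t in targets)
    if mode == "enrich":
        return "sequence" if on_target else "unblock"
    if mode == "deplete":
        return "unblock" if on_target else "sequence"
    raise ValueError(f"unknown mode: {mode!r}")


# Arbitrary example coordinates, for illustration only.
targets = [Target("chr15", 100, 200)]
print(decide([Alignment("chr15", 150, 400)], targets))  # on target
print(decide([Alignment("chr1", 150, 400)], targets))   # off target
```

Swapping `mode` between `"enrich"` and `"deplete"` inverts the decision, which mirrors the enrichment and depletion use cases mentioned in the abstract.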
All reads generated in the course of this study are available from the ENA under project ID PRJEB36644.
Our code is available open source at http://www.github.com/LooseLab/readfish. See also “readfish code availability” above.
We thank J. Quick, J. Tyson, J. Simpson and N. Loman for helpful comments and (mainly) criticisms and E. Birney, N. Goldman and A. Senf for helpful insights and discussion on these approaches. We thank M. Hubank and L. Gallagher for access to materials and reagents as well as general boundless enthusiasm. We thank M. Jain for assisting in manipulating data. We also thank S. Reid, C. Wright, C. Seymour, J. Pugh and G. Pimm from ONT for advice on MinKNOW and Guppy operations as well as extensive troubleshooting. This work was supported by the Biotechnology and Biological Sciences Research Council (grant numbers BB/N017099/1, R.M. and M.L.; BB/M020061/1, M.L.; and BB/M008770/1, 1949454 A.P.), the Wellcome Trust (grant number 204843/Z/16/Z, N.H. and M.L.) and the Defence Science and Technology Laboratory (grant number DSTLX-1000138444, R.M. and M.L.).
M.L. was a member of the MinION access program and has received free flow cells and sequencing reagents in the past. M.L. has received reimbursement for travel, accommodation and conference fees to speak at events organized by ONT.
Peer review information: Nature Biotechnology thanks Jan Korbel and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Payne, A., Holmes, N., Clarke, T. et al. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-00746-x