Readfish enables targeted nanopore sequencing of gigabase-sized genomes

Abstract

Nanopore sequencers can selectively sequence certain DNA molecules in a pool by reversing the voltage across individual nanopores to reject specific sequences, enabling enrichment and depletion to address biological questions. Previously, we achieved this using dynamic time warping to map the signal to a reference genome, but the method required substantial computational resources and did not scale to gigabase-sized references. Here we overcome this limitation by using graphics processing unit (GPU) base-calling. We show enrichment of specific chromosomes from the human genome and of low-abundance organisms in mixed populations without a priori knowledge of sample composition. Finally, we enrich targeted panels comprising 25,600 exons from 10,000 human genes and 717 genes implicated in cancer, identifying PML-RARA fusions in the NB4 cell line in <15 h of sequencing. These methods can be used to efficiently screen any target panel of genes without specialized sample preparation, using any computer with a suitable GPU. Our toolkit, readfish, is available at https://www.github.com/looselab/readfish.
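The decision at the heart of this approach can be illustrated with a minimal sketch. It is not the readfish implementation: the `Alignment` type, the `decide` function, the action names and the `max_chunks` cutoff are all hypothetical, and the sketch assumes that base-calling and mapping of each read chunk have already happened elsewhere. It shows only how a mapping result might be turned into a choice between rejecting a molecule, sequencing it to completion, or waiting for more signal when enriching a set of target contigs.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Alignment:
    """Hypothetical result of mapping a base-called read chunk."""
    contig: str
    position: int


def decide(alignment: Optional[Alignment], targets: Set[str],
           chunks_seen: int, max_chunks: int = 4) -> str:
    """Return an illustrative action for one read during enrichment.

    "stop_receiving": on target, let the pore sequence the whole molecule.
    "unblock":        off target, reverse the voltage to eject the molecule.
    "proceed":        no mapping yet, wait for another chunk of signal.
    The max_chunks cutoff is an assumption made for this sketch.
    """
    if alignment is None:
        # Give up on reads that still do not map after several chunks.
        return "unblock" if chunks_seen >= max_chunks else "proceed"
    if alignment.contig in targets:
        return "stop_receiving"
    return "unblock"


# Example: enrich two chromosomes from a human sample.
targets = {"chr21", "chr22"}
print(decide(Alignment("chr21", 1_000), targets, chunks_seen=1))
print(decide(Alignment("chr1", 5), targets, chunks_seen=1))
print(decide(None, targets, chunks_seen=1))
```

For depletion, the two mapped branches would simply be swapped, so that reads mapping to the listed contigs are ejected and everything else is sequenced.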

Data availability

All reads generated in the course of this study are available from the ENA under project ID PRJEB36644.

Code availability

Our code is available open source at http://www.github.com/LooseLab/readfish. See also “readfish code availability” above.

Acknowledgements

We thank J. Quick, J. Tyson, J. Simpson and N. Loman for helpful comments and (mainly) criticisms and E. Birney, N. Goldman and A. Senf for helpful insights and discussion on these approaches. We thank M. Hubank and L. Gallagher for access to materials and reagents as well as general boundless enthusiasm. We thank M. Jain for assisting in manipulating data. We also thank S. Reid, C. Wright, C. Seymour, J. Pugh and G. Pimm from ONT for advice on MinKNOW and Guppy operations as well as extensive troubleshooting. This work was supported by the Biotechnology and Biological Sciences Research Council (grant numbers BB/N017099/1, R.M. and M.L.; BB/M020061/1, M.L.; and BB/M008770/1, 1949454 A.P.), the Wellcome Trust (grant number 204843/Z/16/Z, N.H. and M.L.) and the Defence Science and Technology Laboratory (grant number DSTLX-1000138444, R.M. and M.L.).

Author information

Affiliations

  1. DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK

    Alexander Payne, Nadine Holmes, Thomas Clarke, Rory Munro, Bisrat J. Debebe & Matthew Loose

Contributions

M.L. and A.P. conceived the study. A.P., N.H. and M.L. acquired data. T.C. and R.M. designed and implemented metagenomics applications. A.P., B.J.D. and M.L. analyzed and interpreted data. All authors discussed the results and contributed to the final manuscript.

Corresponding author

Correspondence to Matthew Loose.

Ethics declarations

Competing interests

M.L. was a member of the MinION access program and has received free flow cells and sequencing reagents in the past. M.L. has received reimbursement for travel, accommodation and conference fees to speak at events organized by ONT.

Additional information

Peer review information Nature Biotechnology thanks Jan Korbel and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Payne, A., Holmes, N., Clarke, T. et al. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-00746-x
