Extended-representation bisulfite sequencing of gene regulatory elements in multiplexed samples and single cells thumbnail

Extended-representation bisulfite sequencing of gene regulatory elements in multiplexed samples and single cells

Abstract

The biological roles of DNA methylation have been elucidated by profiling methods based on whole-genome or reduced-representation bisulfite sequencing, but these approaches do not efficiently survey the vast numbers of non-coding regulatory elements in mammalian genomes. Here we present an extended-representation bisulfite sequencing (XRBS) method for targeted profiling of DNA methylation. Our design strikes a balance between expanding coverage of regulatory elements and reproducibly enriching informative CpG dinucleotides in promoters, enhancers and CTCF binding sites. Barcoded DNA fragments are pooled before bisulfite conversion, allowing multiplex processing and technical consistency in low-input samples. Application of XRBS to single leukemia cells enabled us to evaluate genetic copy number variations and methylation variability across individual cells. Our analysis highlights heterochromatic H3K9me3 regions as having the highest cell-to-cell variability in their methylation, likely reflecting inherent epigenetic instability of these late-replicating regions, compounded by differences in cell cycle stages among sampled cells.

Access options

Subscribe to Journal

Get full journal access for 1 year

$59.00

only $4.92 per issue

All prices are NET prices.

VAT will be added later in the checkout.

Tax calculation will be finalised during checkout.

Rent or Buy article

Get time limited or full article access on ReadCube.

from$8.99

All prices are NET prices.

Data availability

The datasets generated during this study have been deposited in the Gene Expression Omnibus with primary accession code GSE149954.

References

  1. 1.

    Luo, C., Hajkova, P. & Ecker, J. R. Dynamic DNA methylation: in the right place at the right time. Science 361, 1336–1340 (2018).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  2. 2.

    Yoder, J. A., Walsh, C. P. & Bestor, T. H. Cytosine methylation and the ecology of intragenomic parasites. Trends Genet. 13, 335–340 (1997).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  3. 3.

    Greenberg, M. V. C. & Bourc’his, D. The diverse roles of DNA methylation in mammalian development and disease. Nat. Rev. Mol. Cell Biol. 20, 590–607 (2019).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  4. 4.

    Deaton, A. M. & Bird, A. CpG islands and the regulation of transcription. Genes Dev. 25, 1010–1022 (2011).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  5. 5.

    Schübeler, D. Function and information content of DNA methylation. Nature 517, 321–326 (2015).

    PubMed 
    Article 
    CAS 

    Google Scholar
     

  6. 6.

    Baylin, S. B. & Jones, P. A. A decade of exploring the cancer epigenome – biological and translational implications. Nat. Rev. Cancer 11, 726–734 (2011).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  7. 7.

    Zemach, A., McDaniel, I. E., Silva, P. & Zilberman, D. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science 328, 916–919 (2010).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  8. 8.

    Yin, Y. et al. Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356, eaaj2239 (2017).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  9. 9.

    Song, Y. et al. Dynamic enhancer DNA methylation as basis for transcriptional and cellular heterogeneity of ESCs. Mol. Cell 75, 905–920 (2019).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  10. 10.

    Flavahan, W. A. et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature 529, 110–114 (2016).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  11. 11.

    Gu, H. et al. Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nat. Protoc. 6, 468–481 (2011).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  12. 12.

    Meissner, A. et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454, 766–770 (2008).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  13. 13.

    Boyle, P. et al. Gel-free multiplexed reduced representation bisulfite sequencing for large-scale DNA methylation profiling. Genome Biol. 13, 1–10 (2012).

    Article 
    CAS 

    Google Scholar
     

  14. 14.

    Akalin, A. et al. Base-pair resolution DNA methylation sequencing reveals profoundly divergent epigenetic landscapes in acute myeloid leukemia. PLoS Genet. 8, e1002781 (2012).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  15. 15.

    Garrett-Bakelman, F. E. et al. Enhanced reduced representation bisulfite sequencing for assessment of DNA methylation at base pair resolution. J. Vis. Exp. e52246 (2015).

  16. 16.

    Li, G. et al. Joint profiling of DNA methylation and chromatin architecture in single cells. Nat. Methods 16, 991–993 (2019).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  17. 17.

    Tanaka, K. & Okamoto, A. Degradation of DNA by bisulfite treatment. Bioorg. Med. Chem. Lett. 17, 1912–1915 (2007).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  18. 18.

    Kint, S., De Spiegelaere, W., De Kesel, J., Vandekerckhove, L. & Van Criekinge, W. Evaluation of bisulfite kits for DNA methylation profiling in terms of DNA fragmentation and DNA recovery using digital PCR. PLoS ONE 13, e0199091 (2018).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  19. 19.

    Ben-Hattar, J. & Jiricny, J. Methylation of single CpG dinucleotides within a promoter element of the Herpes simplex virus tk gene reduces its transcription in vivo. Gene 65, 219–227 (1988).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  20. 20.

    Schübeler, D. Function and information content of DNA methylation. Nature 517, 321–326 (2015).

    PubMed 
    Article 
    CAS 

    Google Scholar
     

  21. 21.

    Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  22. 22.

    Liu, X. S. et al. Editing DNA methylation in the mammalian genome. Cell 167, 233–247 (2016).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  23. 23.

    Fu, Y., Sinha, M., Peterson, C. L. & Weng, Z. The insulator binding protein CTCF positions 20 nucleosomes around its binding sites across the human genome. PLoS Genet. 4, e1000138 (2008).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  24. 24.

    Schuyler, R. P. et al. Distinct trends of DNA methylation patterning in the innate and adaptive immune systems. Cell Rep. 17, 2101–2111 (2016).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  25. 25.

    Wiehle, L. et al. DNA (de)methylation in embryonic stem cells controls CTCF-dependent chromatin boundaries. Genome Res. 29, 750–761 (2019).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  26. 26.

    Li, Y. et al. Alterations of specific chromatin conformation affect ATRA-induced leukemia cell differentiation. Cell Death Dis. 9, 200 (2018).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  27. 27.

    Varley, K. E. et al. Dynamic DNA methylation across diverse human cell lines and tissues. Genome Res. 23, 555–567 (2013).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  28. 28.

    Zhou, W. et al. DNA methylation loss in late-replicating domains is linked to mitotic cell division. Nat. Genet. 50, 591–602 (2018).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  29. 29.

    Michalowsky, L. A. & Jones, P. A. Differential nuclear protein binding to 5-azacytosine-containing DNA as a potential mechanism for 5-aza-2′-deoxycytidine resistance. Mol. Cell Biol. 7, 3076–3083 (1987).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  30. 30.

    Aran, D. & Hellman, A. DNA methylation of transcriptional enhancers and cancer predisposition. Cell 154, 11–13 (2013).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  31. 31.

    Aran, D., Sabato, S. & Hellman, A. DNA methylation of distal regulatory sites characterizes dysregulation of cancer genes. Genome Biol. 14, R21 (2013).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  32. 32.

    Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  33. 33.

    Bell, A. C. & Felsenfeld, G. Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature 405, 482–485 (2000).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  34. 34.

    Hark, A. T. et al. CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus. Nature 405, 486–489 (2000).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  35. 35.

    Moore, J. E. et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  36. 36.

    Corces, M. R. et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  37. 37.

    Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  38. 38.

    McGahon, A. J. et al. Downregulation of Bcr-Abl in K562 cells restores susceptibility to apoptosis: characterization of the apoptotic death. Cell Death Differ. 4, 95–104 (1997).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  39. 39.

    Charlton, J. et al. Global delay in nascent strand DNA methylation. Nat. Struct. Mol. Biol. 25, 327–332 (2018).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  40. 40.

    Hansen, R. S. et al. Sequencing newly replicated DNA reveals widespread plasticity in human replication timing. Proc. Natl Acad. Sci. USA 107, 139–144 (2010).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  41. 41.

    Angeloni, A. & Bogdanovic, O. Enhancer DNA methylation: implications for gene regulation. Essays Biochem. 63, 707–715 (2019).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  42. 42.

    Mulqueen, R. M. et al. Highly scalable generation of DNA methylation profiles in single cells. Nat. Biotechnol. 36, 428–431 (2018).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  43. 43.

    Luo, C. et al. Robust single-cell DNA methylome profiling with snmC-seq2. Nat. Commun. 9, 3824 (2018).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  44. 44.

    Luo, C. et al. Single-cell methylomes identify neuronal subtypes and regulatory elements in mammalian cortex. Science 357, 600–604 (2017).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  45. 45.

    Hou, Y. et al. Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas. Cell Res. 26, 304–319 (2016).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  46. 46.

    ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

    Article 
    CAS 

    Google Scholar
     

  47. 47.

    Kacmarczyk, T. J. et al. ‘Same difference’: comprehensive evaluation of four DNA methylation measurement platforms. Epigenetics Chromatin 11, 21 (2018).

    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar
     

  48. 48.

    Chen, C. et al. Single-cell whole-genome analyses by Linear Amplification via Transposon Insertion (LIANTI). Science 356, 189–194 (2017).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  49. 49.

    Hovestadt, V. et al. Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing. Nature 510, 537–541 (2014).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  50. 50.

    Guo, F. et al. Active and passive demethylation of male and female pronuclear DNA in the mammalian zygote. Cell Stem Cell 15, 447–459 (2014).

    CAS 
    PubMed 
    Article 

    Google Scholar
     

  51. 51.

    Gaiti, F. et al. Epigenetic evolution and lineage histories of chronic lymphocytic leukaemia. Nature 569, 576–580 (2019).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

  52. 52.

    Ghandi, M. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).

    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar
     

Download references

Acknowledgements

B.E.B. is the Bernard and Mildred Kayden Endowed MGH Research Institute Chair and an American Cancer Society Research Professor. This research was supported by a Pioneer Award from the National Institutes of Health Common Fund and National Cancer Institute (DP1CA216873) and by the Gene Regulation Observatory at the Broad Institute. S.J.S. was supported by a Medical Scientist Training Award from the National Institute of General Medical Sciences (T32GM007753). V.H. was supported by a Human Frontier Science Program long-term fellowship (LT000596/2016-L). We thank R. Boursiquot for technical assistance and L. Gaskell, W. Flavahan and other Bernstein lab members for helpful discussions.

Author information

Affiliations

  1. Department of Pathology and Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA

    Sarah J. Shareef, Samantha M. Bevill, Ayush T. Raman, Martin J. Aryee, Peter van Galen, Volker Hovestadt & Bradley E. Bernstein

  2. Broad Institute of Harvard and MIT, Cambridge, MA, USA

    Sarah J. Shareef, Samantha M. Bevill, Ayush T. Raman, Martin J. Aryee, Peter van Galen, Volker Hovestadt & Bradley E. Bernstein

  3. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

    Martin J. Aryee

  4. Division of Hematology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA

    Peter van Galen

  5. Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA

    Volker Hovestadt

  6. Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA, USA

    Volker Hovestadt

Contributions

S.J.S., P.v.G, V.H. and B.E.B. conceptualized and designed experiments. S.J.S. optimized and executed XRBS experiments. S.J.S. and S.M.B. performed decitabine experiments. V.H. performed computational analyses. A.T.R. contributed to computational analyses of single-cell data. M.J.A., V.H. and B.E.B. provided senior guidance. S.J.S., V.H. and B.E.B. wrote the manuscript with assistance from other authors, all of whom approved the final submission.

Corresponding authors

Correspondence to
Volker Hovestadt or Bradley E. Bernstein.

Ethics declarations

Competing interests

B.E.B. discloses financial interests in Fulcrum Therapeutics, HiFiBio, Arsenal Biosciences, Cell Signaling Technologies and BioMillenia. The remaining authors declare no competing financial interests.

Additional information

Peer review information Nature Biotechnology thanks Jonas Demeulemeester and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Evaluation of single MspI anchor design for methyl-CpG profiling.

a, Plots show results from an in silico MspI restriction digest analysis of the human genome. The cumulative number of MspI fragments (total of 2.3 million, left), of basepairs (total of 3.1 billion, middle), and of CpGs (total of 29.4 million, right) is shown relative to increasing MspI fragment length. Vertical dotted lines show the size range of fragments captured in typical RRBS experiments. This analysis shows that RRBS of MspI fragments 40–120 bases in length covers only 0.9% of the genome, but enriches for 5.6% of genomic CpGs. Recent implementations of RRBS (for example enhanced RRBS14,15) that consider fragments up to 320 bases in length cover an additional 9.7% of CpGs. Approximately 35.0% of CpGs that are located within 300 bases of a single MspI site are not captured by these techniques. b, Histogram shows coverage depth of MspI restriction sites for individual replicates of a 10 ng XRBS library (left, middle), and both replicates combined (right). c, Heatmap shows coverage depth of CpGs between replicates of a 10 ng XRBS library (Pearson’s r = 0.90). d, Histogram shows coverage depth of CpGs in the combined dataset of both replicates (n = 2 independently generated libraries). e, Plot shows unique reads as a function of aligned reads in millions. With greater sequencing depth the fraction of unique reads decreases, as the chance of sampling a non-unique read (that is PCR duplicate) increases.

Extended Data Fig. 2 Comparison of MspI fragment detection and CpG coverage over regulatory elements.

a, Plot shows number of detected fragments plotted as a function of calculated MspI fragment length from XRBS 10 ng library replicates and from public RRBS and enhanced RRBS (ERRBS) datasets. Because of the random hexamer-primed second strand elongation step, XRBS efficiently detects fragments that exceed the selected fragment size range in RRBS (40–220 bp) and ERRBS (70–320 bp). XRBS less efficiently captures short fragments (<70 bp) compared to ERRBS and RRBS. Peaks in the graph correspond to three fragments commonly generated from Alu repetitive elements. b, Plot compares CpG coverage as a function of sequencing depth (x-axis) for XRBS (red), WGBS (blue, ENCODE), ERRBS (orange, obtained from ref. 47) and RRBS (green, ENCODE). c, Downsampling analysis plot as in panel B but restricted to CpGs within CpG islands (top) and gene promoters (bottom). d, Downsampling analysis plot as in panel B but restricted to CpGs within H3K27ac peaks (top) and CTCF binding sites (bottom).

Extended Data Fig. 3 XRBS efficiently covers CpGs in regulatory elements and repetitive regions.

a, Plots show the number of proximal enhancer-like, distal enhancer-like, and CTCF-only elements (as defined in the ENCODE SCREEN database35) with at least 25-fold combined coverage as a function of sequencing depth for XRBS (red), WGBS (blue), ERRBS (orange), and RRBS (green). Enrichment for functional elements at a uniform sequencing depth of 10 billion base pairs is indicated. Vertical grey line indicates break in x-axis scale. b, Heatmaps depict genomic regions (rows, n = 3,725,365 LTRs, SINEs, and LINEs) containing different repeat element families (as defined by RepeatMasker). Individual repeat elements are divided into 50 equally sized windows (5’ and 3’ position indicated). Upstream and downstream regions (±200 bp) are divided into 25 equally sized windows. Panels from left to right show DNA methylation calls from 450k methylation array, RRBS, ERRBS, XRBS, and WGBS. c, Plots compare CpG coverage within different repeat element families as a function of sequencing depth for XRBS (red), WGBS (blue), ERRBS (orange), and RRBS (green). CpG enrichment relative to WGBS is indicated. In comparison to RRBS, XRBS enriches for most repeat families, with the exception of Alu and ERV1 elements that frequently contain MspI restriction sites and are also efficiently captured (see also Extended Data Fig. 2a).

Extended Data Fig. 4 Correlation of DNA methylation with histone marks and compartment calls.

a, Plot shows unique reads as a function of aligned reads in low-coverage XRBS libraries from K562, HL-60, OCI-AML3, and Kasumi-1 cells. b, Plot shows unique reads as a function of aligned reads in low-coverage libraries from K562, Kasumi-1, HL-60, OCI-AML3 cells. Libraries were generated from 1,000 (green) and 100 (orange) cells sorted directly into lysis buffer. Libraries generated from 1,000 cells are comparable to libraries generated from 10 ng of purified DNA (panel A), whereas 100 cell libraries show reduced complexity. c, Heatmap shows Pearson correlation of XRBS methylation profiles of 100 kb windows generated from 10 ng gDNA, 1,000 or 100 sorted cells across four cell lines. Dendrogram derived from unsupervised clustering is indicated to the left. Sample grouping by DNA methylation is consistent with cell identity, indicating low technical variability between input material. d, Heatmaps show correlation between average DNA methylation values and signal for H3K9me3 (left), H3K27me3 (center), and H3K36me3 (right) in 100kb-windows for K562 cells. e, Heatmap shows correlation between DNA methylation and the Hi-C-derived first eigenvector indicating compartment A (positive values) and compartment B (negative values) in 100kb-windows for K562 cells. f, Heatmap shows correlation between average DNA methylation values and ChIP seq signal for H3K9me3 (top), H3K27me3 (middle), and H3K36me3 (bottom) in 100kb-windows for human H1 embryonic stem cells, primary T cells and mammary epithelial cells, and cultured IMR90, GM12878 and K562 cells. g, Heatmap as in panel F, but shows correlation between average DNA methylation values and the Hi-C-derived first eigenvector (x-axis). Positive values correspond to compartment A and negative values correspond to compartment B. While hypomethylation of compartment B is most pronounced in K562 cells, a similar trend is also observed in other cultured cell lines and in primary mammary epithelial cells.

Extended Data Fig. 5 Characterization of decitabine treatment of cancer cell lines.

a, Plot shows dose response curve for decitabine treatment of three cell lines Kasumi, HL-60, and OCI-AML3. Viability was measured using cell titer glo and is reported as luminescence relative to control DMSO treated cells (n = 3 independently treated replicates, error bars represent standard deviation). b, Images show HL60 and OCI-AML3 cells treated with 300 nM decitabine and a DMSO vehicle control. Morphology of decitabine treated cells similar to control, repeated three times. Scale bar is indicated and applies to all images. c, Plot shows unique reads as a function of aligned reads in XRBS libraries from DMSO- and decitabine-treated HL-60 and OCI-AML3 cells. d, Barplot shows average DNA methylation values across island (dark grey) and non-island (light grey) CpGs in DMSO- and decitabine-treated HL-60 and OCI-AML3 cells. For example, average methylation of non-island CpGs in HL-60 cells is reduced from 0.68 to 0.54 by decitabine treatment (20.2% reduction, n = 1 library per treatment). e, Heatmap shows correlation between Hi-C-derived first eigenvectors from K562 and HL-60 cell lines in 100kb-windows, indicating high agreement in compartment structure between both cell lines. f, Heatmaps show correlation between average DNA methylation values and Hi-C-derived eigenvector in 100kb-windows for DMSO- (left) and decitabine-treated HL-60 cells (center). Heatmap on the right shows relative DNA methylation values of decitabine- and DMSO-treated cells. Despite compartment B showing lower methylation compared to compartment A at baseline, induced DNA hypomethylation with decitabine treatment affects compartment A and B equally.

Extended Data Fig. 6 Differential DNA methylation of gene promoters.

a, Plot shows unique reads as a function of aligned reads in 1,000 cell high-coverage libraries of four cell lines. b, Heatmap depicts 8 kb regions (rows, n = 3,972 promoters) centered at transcription start sites that show cell line-specific hyper- or hypomethylation (as in Fig. 4a) and divided into 100 equally sized windows. Panels from left to right show methylation calls from 450k methylation array, RRBS, XRBS, and WGBS in K562 cells. All datasets except XRBS were retrieved from ENCODE46. c, Plot shows expression levels for genes that were associated with cell line-specific promoter hyper- and hypo-methylation. Genes with an expression level larger than 0.5 are considered as expressed. Average gene expression levels are indicated by horizontal lines. P-values were generated using a two-sided Mann-Whitney U test. In K562, the majority (74.0%) of hypomethylated promoters are associated with non-expressed genes, which is unique to this cell line, consistent with global hypo-methylation in K562. d, Scatterplot compares gene expression level and H3K4me3 ChIP-seq signal for gene promoters that were identified as differentially methylated between all four cell lines. Individual promoters (dots) are colored if specifically hypermethylated (red) and hypermethylated (blue) in K562 cells. This analysis shows that promoters which are not expressed and specifically hypomethylated in K562 (n = 1,624 promoters) are generally negative for the H3K4me3 histone mark (98.7%), whereas promoters that are hypomethylated and expressed (n = 570) are more frequently positive for H3K4me3 (45.0%).

Extended Data Fig. 7 Evaluating the use of XRBS DNA methylation profiling to predict H3K27 acetylation and CTCF binding.

a, Heatmap depicts 8 kb regions (rows, n = 15,202 peaks) centered on H3K27ac peaks, grouped into regions that are hypomethylated specifically in K562 or OCI-AML3 cells (as in Fig. 4b). Peaks that are not specifically hypomethylated (‘Others’) are downsampled for visualization. Regions are divided into 100 equally sized windows. Panels from left to right show: methylation calls from 450k methylation array, RRBS, XRBS, and WGBS in K562 cells. All datasets except XRBS were retrieved from ENCODE46. b, Scatterplot shows merged H3K27ac peaks from OCI-AML3 and K562 ChIP-seq datasets. Individual peaks (dots) are colored if specifically hypomethylated in K562 (blue) or OCI-AML3 (red) cells. c, Line plot (bottom) shows difference in methylation between K562 and OCI-AML3 cells over merged H3K27ac peaks (n = 15,202 peaks). Of these peaks, 8.3% and 2.3% are specifically hypomethylated in K562 (methylation difference ≤ −0.5) and OCI-AML3 ( ≥ 0.5) cells, respectively. Bar plot (top) shows the fraction of cell line-specific H3K27ac peaks within 100 equally sized bins grouped by difference in methylation. Shared peaks are indicated in gray. d, Receiver operating characteristic (ROC) curve shows performance of predicting cell line-specific H3K27ac peaks based on difference in DNA methylation over peaks that are covered by XRBS. Sensitivity and specificity are indicated at different thresholds (±0.2 and ±0.5, as in panel C). e, Heatmap depicts 2 kb regions (rows, n = 7,629 peaks) centered at merged CTCF peaks from K562 and HL-60 Chip-seq datasets. Individual peaks (dots) are colored if specifically hypermethylated in K562 or HL-60 cells (as in Fig. 4c). Peaks not specifically hypermethylated (‘Others’) are downsampled for visualization. Panels from left to right show methylation calls from 450k methylation arrays, RRBS, XRBS, and WGBS in K562 cells. All datasets except XRBS were retrieved from ENCODE46. f, Scatterplot shows merged CTCF peaks from K562 and HL-60 ChIP-seq datasets. Individual CTCF binding sites (dots) are colored if specifically hypermethylated in K562 (red) or HL-60 (blue) cells. g, Line plot (bottom) shows difference in methylation between K562 and HL-60 cells over merged CTCF peaks (n = 7,629 peaks). Bar plot (top) shows the fraction of cell line-specific CTCF peaks within 100 equally sized bins grouped by difference in methylation. Shared peaks are indicated in gray. h, ROC curve shows performance of predicting cell line-specific CTCF peaks based on difference in DNA methylation over peaks that are covered by XRBS. Sensitivity and specificity are indicated at different thresholds (±0.2 and ±0.5, as in panel G).

Extended Data Fig. 8 XRBS profiling of limited human bone marrow cell types.

a, Plots show the gating strategy for fluorescence assisted cell sorting (FACS) of human bone marrow of CD34 + HSPCs, CD3 + T cells, and CD14 + monocytes. Singlets (FSC-W vs. -H) and viable cells (PI vs. FSC-A) were sorted based on cell surface marker signal. b, Plot shows unique reads as a function of aligned reads in libraries from unsorted human bone marrow, HSPCs, monocytes, and T cells. Libraries were generated from 100 sorted cells. 1000 cells were used for the unsorted bone marrow library. c, Heatmap depicts 4 kb regions (rows, n = 2,170 regions) centered over elements defined in the ENCODE SCREEN database. Only differentially methylated elements between monocytes and T cells are shown. Elements were stratified by their methylation status in HSPCs (hypomethylated: top; hypermethylated: bottom). Methylation levels of the unsorted bone marrow are shown for comparison (left). ATAC-seq signal for sorted hematopoietic stem cells (HSCs), monocyte, and CD4 + T cells (obtained from36).

Extended Data Fig. 9 Evaluation of single cell XRBS profiles.

a, Plot shows unique reads as a function of aligned reads in single cell XRBS profiles (n = 96 cells). With greater sequencing depth the fraction of unique reads decreases, as the chance of sampling a non-unique read (that is PCR duplicate) increases. b, Boxplots compare DNA methylation profiles from human scXRBS (n = 59 cells) and three published scRRBS datasets generated from human cells: Chronic lymphocytic leukemia (n = 282 cells)51, hepatocellular carcinoma and HepG2 cells (n = 34 cells)45, and oocytes, sperm and pronuclei (n = 35 cells)50. Single cells from Hou et al. were generated using the scTrio-seq protocol that in part resembles scRRBS. Only CpGs within 75 bases of an MspI cut site were considered for scRRBS libraries to adjust for differences in read lengths. Libraries from Gaiti et al. were sequenced at 2×51 bases. Left plot shows the number of paired-end reads sequenced for each cell. Other plots show the number of CpGs covered (≥1-fold) across all CpGs in the genome, CpGs within distal enhancer-like regions, and CpGs within ‘CTCF-only’ regions (SCREEN database35). Both strands of a CpG dinucleotide are assessed individually. Although sequenced at the lowest depth, scXRBS libraries on average capture the most CpGs, particularly in CpG-sparse regions. Boxplots were generated in R using default settings: Bounds of box and thick horizontal line represent 25th, 75th, and 50th percentile of observations, whiskers represent minimum and maximum observations, and outliers are indicated as dots. c, Barplot shows the fraction of unique reads (that is reads not representing PCR duplicates) per single cell library. Within the same PCR reaction, the duplicate rate was very similar, irrespective of the total number of aligned reads per single cell. Each bar plot represents a single cell XRBS library. Twenty four barcoded cells were in each of 4 independent libraries. d, Heatmap compares alternate allele frequencies from SNP array data for K562 and HL-60 cell lines. Cell line-specific homozygous alleles are indicated by white boxes boxes and were used for single cell SNP analysis in Fig. 5d. e, Plots show copy number variation calls from combined single cell XRBS profiles (top) and whole exome sequencing data (middle) for K562 cells. A number of chromosomes show differences in copy number between XRBS and whole exome sequencing (bottom). However, these differences likely represent true copy number variations between K562 cells used for these experiments. f, Heatmap shows pairwise correlation coefficients of single cell methylation profiles. Dendrogram shows unsupervised clustering. Single cell XRBS profiles cluster by cell type. g, Barplot shows K562 single cell average DNA methylation values within various early and late replicating regions. Each bar represents an individual K562 single cell library. There are 32 single cell libraries plotted for each cell cycle phase. h, Heatmap shows pairwise correlation of average DNA methylation values within various early and late replicating regions. Late replicating regions (G2 phase) cluster separately. These results suggest that one source of single cell DNA methylation heterogeneity is decreased fidelity of maintenance DNA methylation in late replicating domains.

Supplementary information

About this article

Verify currency and authenticity via CrossMark

Cite this article

Shareef, S.J., Bevill, S.M., Raman, A.T. et al. Extended-representation bisulfite sequencing of gene regulatory elements in multiplexed samples and single cells.
Nat Biotechnol (2021). https://doi.org/10.1038/s41587-021-00910-x

Download citation

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *