
Compressed sensing for highly efficient imaging transcriptomics


Recent methods for spatial imaging of tissue samples can identify up to ~100 individual proteins (refs. 1–3) or RNAs (refs. 4–10) at single-cell resolution. However, the number of proteins or genes that can be studied in these approaches is limited by long imaging times. Here we introduce Composite In Situ Imaging (CISI), a method that leverages structure in gene expression across both cells and tissues to limit the number of imaging cycles needed to obtain spatially resolved gene expression maps. CISI defines gene modules that can be detected using composite measurements from imaging probes for subsets of genes. The data are then decompressed to recover expression values for individual genes. CISI further reduces imaging time by not relying on spot-level resolution, enabling lower magnification acquisition, and is overall about 500-fold more efficient than current methods. Applying CISI to 12 mouse brain sections, we accurately recovered the spatial abundance of 37 individual genes from 11 composite measurements covering 180 mm² and 476,276 cells.
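The idea of measuring composites and then decompressing can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the module definitions, the composition matrix and the expression values are invented, and the recovery step assumes each cell expresses exactly one module, which is far simpler than the sparse, learned-module decompression the abstract describes.

```python
# Toy illustration of composite measurement and decompression.
# All numbers are hypothetical; none come from the study.

# Hypothetical gene modules: each row is the relative expression of 6 genes.
MODULES = [
    [1.0, 0.8, 0.0, 0.0, 0.0, 0.0],  # module 0: genes 0 and 1 co-expressed
    [0.0, 0.0, 1.0, 0.5, 0.0, 0.0],  # module 1: genes 2 and 3
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],  # module 2: genes 4 and 5
]

# Hypothetical composition matrix: 3 composite measurements of 6 genes.
# A[i][g] == 1 means probes for gene g are pooled into composite i.
A = [
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def measure(expression):
    """Each composite signal is the sum of the genes pooled into it."""
    return [dot(row, expression) for row in A]


def decompress(composite):
    """Recover per-gene values, assuming exactly one active module per cell."""
    best = None
    for module in MODULES:
        signature = measure(module)  # how this module looks after compression
        # Least-squares scale of the signature onto the observed composites.
        scale = dot(signature, composite) / dot(signature, signature)
        residual = sum((c - scale * s) ** 2 for c, s in zip(composite, signature))
        if best is None or residual < best[0]:
            best = (residual, [scale * g for g in module])
    return best[1]


true_expression = [0.0, 0.0, 4.0, 2.0, 0.0, 0.0]  # 4 x module 1
composite = measure(true_expression)              # 3 numbers instead of 6
recovered = decompress(composite)
```

In this toy case three composite values are enough to recover six gene values because the cell's expression lies along a single known module. With noisy data or cells that mix several modules, a proper sparse solver would be needed in place of the one-module search.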


Data availability

We used publicly available snRNA-seq datasets released by BICCN (U19 Huang, generated by the Regev lab) and full-length scRNA-seq datasets (the Allen Institute Mouse Whole Cortex and Hippocampus SMART-seq; RRID:SCR_019013). Raw image data from the large validation study are available for download at the Brain Image Library.

Code availability

The code used in this study is available in an online repository. Please see the accompanying Life Sciences Reporting Summary for additional information.


References

  1. Angelo, M. et al. Multiplexed ion beam imaging of human breast tumors. Nat. Med. 20, 436–442 (2014).

  2. Keren, L. et al. A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging. Cell 174, 1373–1387 (2018).

  3. Goltsev, Y. et al. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Cell 174, 968–981 (2018).

  4. Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).

  5. Shah, S., Lubeck, E., Zhou, W. & Cai, L. In situ transcription profiling of single cells reveals spatial organization of cells in the mouse hippocampus. Neuron 92, 342–357 (2016).

  6. Shah, S., Lubeck, E., Zhou, W. & Cai, L. seqFISH accurately detects transcripts in single cells and reveals robust spatial organization in the hippocampus. Neuron 94, 752–758 (2017).

  7. Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).

  8. Wang, G., Moffitt, J. R. & Zhuang, X. Multiplexed imaging of high-density libraries of RNAs with MERFISH and expansion microscopy. Sci. Rep. 8, 4847 (2018).

  9. Codeluppi, S. et al. Spatial organization of the somatosensory cortex revealed by osmFISH. Nat. Methods 15, 932–935 (2018).

  10. Choi, H. M. T. et al. Third-generation in situ hybridization chain reaction: multiplexed, quantitative, sensitive, versatile, robust. Development 145, dev165753 (2018).

  11. Raj, A., van den Bogaard, P., Rifkin, S. A., van Oudenaarden, A. & Tyagi, S. Imaging individual mRNA molecules using multiple singly labeled probes. Nat. Methods 5, 877–879 (2008).

  12. Eng, C. H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature 568, 235–239 (2019).

  13. Cleary, B., Cong, L., Cheung, A., Lander, E. S. & Regev, A. Efficient generation of transcriptomic profiles by random composite measurements. Cell 171, 1424–1436 (2017).

  14. Hrvatin, S. et al. Single-cell analysis of experience-dependent transcriptomic states in the mouse visual cortex. Nat. Neurosci. 21, 120–129 (2018).

  15. Cleary, B. & Regev, A. The necessity and power of random, under-sampled experiments in biology. Preprint at (2020).

  16. Abràmoff, M. D., Magalhães, P. J. & Ram, S. J. Image processing with ImageJ. Biophotonics International (2004).

  17. Hörl, D. et al. BigStitcher: reconstructing high-resolution image datasets of cleared and expanded samples. Nat. Methods 16, 870–874 (2019).

  18. Axelrod, S. et al. Starfish: open source image based transcriptomics and proteomics tools. (2020).

  19. McQuin, C. et al. CellProfiler 3.0: next-generation image processing for biology. PLoS Biol. 16, e2005970 (2018).



Acknowledgements

We thank A. Hupalowska and L. Gaffney for help with figures; S. Farhi, Y. Eldar and members of the Cleary, Chen, Regev and Lander labs for helpful discussions; and the National Institutes of Health’s (NIH) BICCN for open sharing of data before publication. This work was supported by BICCN (1RF1MH12128901) (B.C., A.R. and F.C.) and NIH 1U19MH114821 (A.R.), the Merkin Institute Fellowship at the Broad Institute (B.C.), the Klarman Cell Observatory, the Howard Hughes Medical Institute, the National Human Genome Research Institute Center of Excellence in Genome Science (RM1HG006193) (A.R.) and the Eric and Wendy Schmidt Fellows Program at the Broad Institute (F.C.).

Author information

Author notes

  1. Aviv Regev

    Present address: Genentech, South San Francisco, CA, USA


Affiliations

  1. Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Brian Cleary, Brooke Simonton, Jon Bezney, Evan Murray, Shahul Alam, Anubhav Sinha, Ehsan Habibi, Jamie Marshall, Eric S. Lander & Fei Chen

  2. Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA

    Anubhav Sinha

  3. Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA

    Eric S. Lander & Aviv Regev

  4. Department of Systems Biology, Harvard Medical School, Boston, MA, USA

    Eric S. Lander

  5. Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA

    Fei Chen

  6. Howard Hughes Medical Institute, Chevy Chase, MD, USA

    Aviv Regev

  7. Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Aviv Regev


Contributions

B.C., F.C. and A.R. conceived the study. B.S., J.B., B.C., E.M. and A.S. performed experiments, with assistance and feedback from J.M. B.C., S.A. and E.H. performed snRNA-seq data analysis and developed the image processing pipeline. B.C. developed and implemented the decompression algorithms. B.C., A.R., F.C., E.S.L. and B.S. wrote the manuscript, with input from all authors.

Corresponding authors

Correspondence to Brian Cleary, Eric S. Lander, Fei Chen or Aviv Regev.

Ethics declarations

Competing interests

A.R. is a founder and equity holder of Celsius Therapeutics, an equity holder in Immunitas Therapeutics and, until August 31, 2020, was a Scientific Advisory Board member of Syros Pharmaceuticals, Neogene Therapeutics, Asimov and Thermo Fisher Scientific. From August 1, 2020, A.R. is an employee of Genentech, a member of the Roche Group. E.S.L. serves on the Board of Directors for Codiak BioSciences and Neon Therapeutics and serves on the Scientific Advisory Board of F-Prime Capital Partners and Third Rock Ventures. E.S.L. also serves on the Board of Directors of the Innocence Project, Count Me In and the Biden Cancer Initiative and on the Board of Trustees for the Parker Institute for Cancer Immunotherapy.

Additional information

Peer review information: Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Marker gene expression in snRNA-seq clusters.

For each of 37 genes, shown is the distribution of expression (individual violin plots; y-axis) in each of 23 snRNA-seq clusters (x-axis). Marker genes for similar cell types are grouped together, with the cell type labeled on top.

Extended Data Fig. 2 Analysis of modular factorization based on gene and module diversity.

Pearson correlation (y-axis) between the original expression levels of 37 genes in each cell and those approximated in those cells by Sparse Module Activity Factorization (SMAF). Contour plots depict the density of cells at each level of correlation with either a given number of genes expressed (a; x-axis) or a given number of gene modules by SMAF decomposition (b; x-axis).
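The per-cell Pearson correlation used in this comparison can be sketched as follows. The function is a plain textbook implementation, and the two short expression vectors are invented values standing in for one cell's original and approximated profiles (the study uses 37 genes per cell).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression of four genes in one cell.
original = [5.0, 0.0, 2.0, 0.0]
approximated = [4.5, 0.2, 2.1, 0.0]
r = pearson(original, approximated)
```

A value of r close to 1 means the approximation preserves the relative pattern of expression across genes in that cell, even if absolute levels differ slightly.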

Extended Data Fig. 3 Evaluation of performance of simulated compositions.

Distribution of Pearson correlation between the original and recovered expression levels of 37 genes in each cell (y-axis) across simulation trials for different numbers of composite measurements (a), or for different measurement densities, set by the maximum number of measurements in which each gene was included (b). In (a) the maximum number of compositions per gene is 3, and in (b) the number of compositions is 10. Mini boxplots depict the median (center dots), inner quartiles (upper and lower bounds of the box for the 25th and 75th percentiles) and 1.5× the interquartile range (minima and maxima of whiskers).
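The boxplot elements named in this caption can be computed as in the sketch below. It assumes the common convention that whiskers extend to the most extreme data points lying within 1.5 times the interquartile range of the box, and it uses the "inclusive" quantile method from Python's statistics module; the caption does not say which quantile method the authors used. The sample values are invented.

```python
import statistics


def boxplot_summary(values):
    """Median, quartiles and 1.5 x IQR whisker ends for one group of values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Hypothetical per-cell correlations from one simulation setting.
correlations = [0.1, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]
summary = boxplot_summary(correlations)
```

Here the value 0.1 falls below the lower fence, so it would be drawn as an outlier rather than stretching the whisker.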

Extended Data Fig. 4 Autoencoder-based decompression successfully recovers accurate spatial patterns of individual genes compared to direct measurement on the same section.

RNA images recovered by decompression with the segmentation-free algorithm (magenta) and directly measured (green) in the same tissue section. White: images overlap exactly. Genes are grouped based on the section in which their direct measurements were made. Insets for all genes in a section show the same region, or an adjacent region if no cells for a given gene were present. Scale bar: 500 µm. Representative fields of view in each tissue section were chosen such that every gene validated in a tissue section could be visualized in the same region, while quantification of overlap (correlation) was calculated using all cells in a given tissue section, or using randomly selected testing cells (where indicated).

Extended Data Fig. 5 Comparison of autoencoding and segmentation-based decompression.

Individual gene images recovered (magenta) using the autoencoding algorithm (left) or the segmentation-based algorithm (right) are overlaid with direct measurement (green) of the genes in the same tissue sections (white: direct overlap). For segmentation-based decompression, the decompressed signal for each gene is projected uniformly over each segmentation mask. Scale bar: 500 µm. Representative fields of view were selected to highlight expression of indicated genes, while quantification of overlap (correlation) was calculated using all cells in a given tissue section, or using randomly selected testing cells (where indicated).

Extended Data Fig. 6 Evaluation of recovered signals before and after co-measurement adjustment.

a,b, Adjustment improves recovered signals. Integrated signal intensity for each gene in each cell (individual dots) from direct measurements (x-axis) and from estimates recovered from the autoencoder-decompressed images (y-axis), either before (a) or after (b) co-measurement correction. c, Example correction. Segmented cell intensities before (left) and after (right) correction for two co-measured genes (Hmha1 and Slc17a7) that were not correlated in snRNA-seq.

Extended Data Fig. 7 Evaluation based on genes per cell and cell clusters.

a, Distribution of expression diversity (effective number of genes expressed per cell out of 37 total; y-axis) in snRNA-seq, or based on recovered expression levels using autoencoding or segmentation-based decompression (x-axis). Mini boxplots depict the median (center dots), inner quartiles (upper and lower bounds of the box for the 25th and 75th percentiles) and 1.5× the interquartile range (minima and maxima of whiskers). b, Correspondence (Pearson’s correlation of mean gene expression; color bar) between cell clusters from snRNA-seq (rows) and those found from post hoc segmentation of images recovered using the autoencoding algorithm (columns). One marker gene for each cluster is indicated.

Extended Data Fig. 8 CISI recapitulates clusters and conditional probabilities from scRNA-Seq.

a,b, Consistent identification of cell-type-specific gene programs in scRNA-seq and CISI. The correlation coefficient (color bar) between pairs of genes (row and column labels) in scRNA-seq (a) and decompressed CISI measurements (b). Rows and columns are clustered. Gene clusters of cell-type-specific markers are labeled by the respective cell type. c,d, Consistent cell type expression patterns for IEGs in scRNA-seq and CISI. Conditional probability (color bars) of IEGs (columns) in cells that express a given gene (rows) in scRNA-seq (c) and decompressed CISI (d) data.
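The conditional probability in panels c and d can be read as the fraction of cells expressing a given gene that also express a particular IEG. The sketch below computes that fraction under assumptions that are not taken from the study: a cell counts as expressing a gene when its value exceeds a threshold of zero, Fos is used only as a familiar example of an IEG, and the per-cell values are invented.

```python
def conditional_probability(cells, given_gene, target_gene, threshold=0.0):
    """Estimate P(target_gene expressed | given_gene expressed) across cells."""
    expressing = [cell for cell in cells if cell[given_gene] > threshold]
    if not expressing:
        return float("nan")  # undefined when no cell expresses the given gene
    hits = sum(1 for cell in expressing if cell[target_gene] > threshold)
    return hits / len(expressing)


# Hypothetical per-cell expression values for a marker gene and an IEG.
cells = [
    {"Slc17a7": 3.0, "Fos": 1.0},
    {"Slc17a7": 2.0, "Fos": 0.0},
    {"Slc17a7": 0.0, "Fos": 2.0},
    {"Slc17a7": 1.5, "Fos": 0.5},
]
p = conditional_probability(cells, "Slc17a7", "Fos")
```

Three of the four toy cells express Slc17a7 and two of those also express Fos, so p is 2/3. Computing the same quantity from scRNA-seq and from decompressed CISI values gives the two matrices being compared.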

Extended Data Fig. 9 Gene-level correlation with validation measurements.

For each gene (individual dots) validated in each tissue (colors) the correlation across all segmented cells (y-axis) between values recovered by CISI and directly measured values is plotted vs the average expression level (TPM) in cells expressing the gene in scRNA-Seq (x-axis, left), or the percentage of cells expressing the gene (x-axis, right). Individual data points labeled by gene are provided in Supplementary Table 10.


Cite this article

Cleary, B., Simonton, B., Bezney, J. et al. Compressed sensing for highly efficient imaging transcriptomics. Nat Biotechnol (2021).
