Mol. Cells 2023; 46(2): 74-85
Published online February 27, 2023
https://doi.org/10.14348/molcells.2023.2168
© The Korean Society for Molecular and Cellular Biology
Correspondence to : iksookim@gachon.ac.kr
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
Single-cell research has provided a breakthrough in biology to understand heterogeneous cell groups, such as tissues and organs, in development and disease. Molecular barcoding and subsequent sequencing technology insert a single-cell barcode into isolated single cells, allowing separation cell by cell. Given that multimodal information from a cell defines precise cellular states, recent technical advances in methods focus on simultaneously extracting multimodal data recorded in different biological materials (DNA, RNA, protein, etc.). This review summarizes recently developed single-cell multiomics approaches regarding genome, epigenome, and protein profiles with the transcriptome. In particular, we focus on how to anchor or tag molecules from a cell, improve throughputs with sample multiplexing, and record lineages, and we further discuss the future developments of the technology.
Keywords molecular barcoding, multimodality, multiomics, single cell
Single-cell research has been widely used to determine the major cell types and subsets of heterogeneous samples, such as tissues or organs, from development and diseases (Choi and Kim, 2019; Elmentaite et al., 2022; Eze et al., 2021; He et al., 2020; Kumar et al., 2022; Lee and Park, 2021; Lu et al., 2022; Nomura, 2021; Strzelecka et al., 2018; Unterman et al., 2022). Although extensive validations will follow, the single-cell approach opens the field of identifying new cell types masked by bulk analysis from total lysed cells (Shalek et al., 2013; Wang and Bodovitz, 2010). This approach expands the knowledge of how cells construct tissues or organs and interact with surrounding cells (Domínguez Conde et al., 2022; Eraslan et al., 2022; Suo et al., 2022; Tabula Sapiens Consortium et al., 2022). However, there remain discrepancies between new cell subsets from single-cell analysis and what has been demonstrated through experimental investigations, including FACS (fluorescence-activated cell sorting) (Angerer et al., 2017; Weinreb et al., 2018). For decades, numerous single-cell barcoding technologies have been developed to improve efficiency and throughput to fill the gap or correct current knowledge (Chappell et al., 2018; Jovic et al., 2022; Tang et al., 2019). Recently, technological advances rendering multimodal information stored in different molecules, such as DNA, RNA, and proteins, from a single cell have been reported and are still expanding (Dey et al., 2015; Dimitriu et al., 2022; Frei et al., 2016; Genshaft et al., 2016; Hu et al., 2016; Lee et al., 2020; Macaulay et al., 2015; Perkel, 2021). It is known that abundant cellular information, such as chromatin structures, DNA methylation, and cell surface receptors, is dedicated to defining cell states. Thus, developing single-cell multiomics technology will allow various cellular information to be obtained, which will increase the understanding of complete cell states comprising biological systems. 
This review discusses various single-cell multiomics technologies based on how different molecules are targeted using unique molecular barcoding and what information can be extracted. We also discuss recently developed multiomics technologies using next-generation sequencing at single-cell resolution.
Ligation of unique molecular barcodes in a single well of the multiwell plate simply provides barcoded single cells that will be demultiplexed by sequencing (Fig. 1) (Buenrostro et al., 2018; Kim et al., 2020; Xing et al., 2020). The multiomic approach originates from dividing amplified molecules into different plates, constructing a separated library of genomic DNA (gDNA) or RNA, and combining data by the coordinates of the position of wells. Microfluidic systems with droplets substantially increase throughputs (over 10 thousand cells). A flow containing diluted single-cell suspensions and another flow containing barcoded beads are merged in the flow focus, which continuously generates a droplet of a single cell with a unique barcode (Macosko et al., 2015; Prakadan et al., 2017). Although droplet generators limit the physical isolation of multiple molecules (DNA, RNA, or protein), capturing by unique tags (adapters) on the bead enables demultiplexing and simultaneous separation of each tag. Without physically isolating cells, split-and-pooling techniques (Split&Pooling) generate single-cell barcodes in the cell mixture by combinatorial chemistry (Cao et al., 2017; Rosenberg et al., 2018). The barcode process starts with lysed but preserved intact cells being deposited into several wells in a mixture. Molecules are uniquely barcoded by adapter ligation in a cell mixture, and they are then pooled and split into several wells followed by another round of unique barcode ligation. Two to three rounds of this process enable enough complexity to cover unique barcodes for all single cells. Because this method works with the ligation of nucleic acid barcodes, it supports the ligation of unique barcodes for different molecules simultaneously.
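The claim that two to three rounds of split and pooling give enough barcode complexity can be checked with a short calculation. The Python sketch below is illustrative only and is not taken from the cited protocols: it assumes 96 wells per round, 10,000 cells, and that every cell lands in a well uniformly at random in each round.

```python
def barcode_space(wells_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations after all ligation rounds."""
    return wells_per_round ** rounds


def expected_collision_fraction(n_cells: int, n_barcodes: int) -> float:
    """Expected fraction of cells that share their barcode combination
    with at least one other cell (uniform random assignment assumed)."""
    if n_cells <= 1:
        return 0.0
    # A given cell is unique only if none of the other n_cells - 1 cells
    # drew the same combination.
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


if __name__ == "__main__":
    n_cells = 10_000  # illustrative cell number (assumption)
    for rounds in (1, 2, 3):
        space = barcode_space(96, rounds)  # 96 wells per round (assumption)
        frac = expected_collision_fraction(n_cells, space)
        print(f"{rounds} round(s): {space} combinations, "
              f"expected collision fraction {frac:.4f}")
```

Under these assumptions, a single round cannot separate the cells, whereas each additional round multiplies the barcode space by the number of wells and sharply lowers the chance that two cells share a combination.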
Multiomic approaches in single-cell technology fundamentally require capturing different cellular molecules simultaneously (Figs. 1 and 2, Table 1). Although the loss of materials when separating molecules is inevitable, physically dividing materials at the beginning of experiments allows easy handling and flexible application, such as constructing molecule-specific libraries.
The designed poly-T adapter selectively anchors the poly-A tail at the 3’ end of messenger RNA (mRNA) molecules. Biotinylated poly-T adapters and streptavidin-coated magnetic beads (poly-T-magnetic beads) have been commonly used to separate mRNA from lysed cells in well-based single-cell isolation. After mRNA is pulled down to the bottom of the well, the supernatant is used to build a gDNA library in a separate plate. A cDNA library is then constructed from the beads in resuspension buffer, and single-cell barcodes are inserted by unique library adapters per well. For the gDNA library, transposition by a transposase (Tn5) inserts unique adapters into double-stranded gDNA. Due to steric hindrance, Tn5 binds to dsDNA only where it is free of nucleosomes, providing the sequence of epigenetically regulated elements in which chromatin is accessible to regulatory proteins, such as transcription factors (TFs). Using this principle, established in ATAC-seq (Buenrostro et al., 2013), researchers have captured epigenetic states from transcriptome-based cell clusters of mouse embryonic stem cells (mESCs) in ASTAR-seq (Xing et al., 2020) and immune-cell profiles in sc(ATAC + RNA) sequencing (Reyes et al., 2019). Bisulfite treatment converts unmethylated cytosine to uracil, which is read as thymine after amplification, whereas methylated cytosine is protected; the resulting pattern provides the DNA methylation status, depicting another layer of gene regulation. Because whole-genome bisulfite conversion is performed on naked DNA and therefore covers both heterochromatin and euchromatin, GpC methyltransferase is applied beforehand to record chromatin accessibility: it methylates cytosine in GpC dinucleotides within open chromatin regions, and subsequent bisulfite conversion of the extracted gDNA records chromatin accessibility and genome-wide DNA methylation simultaneously. In scNMT-seq, GpC methylation and bisulfite conversion with biotin-dT-mediated RNA-seq reveal dynamic coupling of epigenetic states among transcriptome-based cell clusters in mESC differentiation (Clark et al., 2018).
A similar protocol called scChaRM-seq is applied to human oocytes and ovarian somatic cells, providing a detailed map of epigenetic landscapes at single-cell resolution (Yan et al., 2021).
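How one bisulfite read-out can carry both accessibility and endogenous methylation becomes clearer with a toy example. The Python sketch below is an illustration under stated assumptions, not the analysis pipeline of any cited method: it considers only the top strand, uses hypothetical function names, and treats cytosines in a GpC context as accessibility reporters, cytosines in a CpG context as endogenous methylation reporters, and GCG cytosines as ambiguous.

```python
def bisulfite_convert(seq: str, methylated: set) -> str:
    """Simulate the sequenced top strand after bisulfite conversion and PCR:
    unmethylated C reads as T, methylated C (indices in `methylated`) stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def cytosine_context(seq: str, i: int) -> str:
    """Classify the cytosine at index i by its neighbours.
    Bases beyond the sequence ends are treated as unknown (simplification)."""
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    prev_base = seq[i - 1] if i > 0 else "N"
    next_base = seq[i + 1] if i + 1 < len(seq) else "N"
    is_gpc = prev_base == "G"
    is_cpg = next_base == "G"
    if is_gpc and is_cpg:
        return "GCG"      # ambiguous: could be either signal
    if is_gpc:
        return "GpC"      # reports accessibility (exogenous methylation)
    if is_cpg:
        return "CpG"      # reports endogenous DNA methylation
    return "other"


def call_states(reference: str, read: str) -> dict:
    """Group reference cytosines by context and record whether each one
    was protected (still C in the read, i.e. methylated)."""
    calls = {}
    for i, base in enumerate(reference):
        if base == "C":
            calls.setdefault(cytosine_context(reference, i), []).append(
                (i, read[i] == "C")
            )
    return calls


if __name__ == "__main__":
    reference = "TGCATACGTTGCAACGT"   # made-up sequence
    # Index 2: GpC in open chromatin; index 6: endogenously methylated CpG.
    read = bisulfite_convert(reference, methylated={2, 6})
    print(read)
    print(call_states(reference, read))
```

In this toy read, protected GpC cytosines mark positions that the methyltransferase could reach, whereas protected CpG cytosines mark endogenous methylation, so the two layers are separated by sequence context alone.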
Conversely, breaking down only the cytoplasmic membrane by optimized lysis buffer and then spinning down the intact nucleus can capture gDNA as enclosed storage. This nuclear isolation method preserves many approaches targeting the genome and epigenome in samples separated from RNA. In scCAT-seq, Tn5 transposition in the plate where the nucleus is spun down provides regulatory relationships between chromatin accessibility and the transcriptome in early embryos (Liu et al., 2019). To reduce the possibility of material loss or contamination when separating, magnet-incorporating antibodies against surface antigens of the nucleus hold nuclei at the bottom of the well in scSIDR-seq (Han et al., 2018). Furthermore, genetic alterations, such as copy-number variations (CNVs) and single-nucleotide variations (SNVs), can be captured by whole genome sequencing from the nuclei. To obtain the DNA methylation status with the transcriptome, bisulfite treatment of the extracted gDNA from the magnet-assisted, isolated nucleus of colorectal cancer cells enables whole genome bisulfite sequencing with SNV analysis in scTrio-seq2 (Bian et al., 2018). In scNOMeRe-seq, GpC methyltransferase generates methylated cytosine on the open chromatin region from the magnet-assisted, isolated single-cell nucleus (Wang et al., 2021). Bisulfite conversion enables a record of chromatin accessibility and genome-wide DNA methylation simultaneously. With RNA-seq from separated samples, this approach reveals that coordination among multiple epigenetic layers regulates the reconstruction of genetic lineages in early embryos.
In approaches that apply the Split&Pooling method, nuclei isolation and barcoded oligo-dT capture are used together, providing single-cell multiomics at low cost and on a massive scale. In SHARE-seq, Tn5 transposition and biotin-oligo-dT-mediated reverse transcription (RT) for cDNA are performed in bulk on fixed nuclei (Ma et al., 2020). Well-specific barcoded oligonucleotides are then distributed across the well plate and hybridized as molecule identifiers. Separating cDNA with streptavidin beads from Tn5-tagged accessible chromatin provided evidence that chromatin accessibility precedes gene expression during lineage commitment in mouse skin at single-cell resolution. In Paired-seq, barcodes are generated by rounds of split and pooling as in SHARE-seq, but the method uses restriction enzymes to cut gDNA- or RNA-derived molecules at predesigned sites in the adapters from split samples, without oligo-dT-based physical isolation of cDNA (Zhu et al., 2019). This method has been applied to the developing forebrain to describe transcriptome-based, cell-type-specific gene regulatory programs.
Protein
Proteins are difficult to separate from other cellular materials because they lack a common sequence that could serve as an anchor, and they are not confined to a specific organelle but are distributed across the cytoplasm and nucleus. The scSTAP method splits each sample, using one half for the transcriptome library and the other half for protein analysis by LC-MS/MS (Jiang et al., 2022). Although this well-based approach loses material during splitting, it allows the characterization of protein profiles in a single cell together with the transcriptome. Molecular tagging is a more feasible way to capture target proteins, which will be discussed later.
Inserting unique molecular barcodes (tags) into each cellular material and demultiplexing through the sequencing platform allows material isolation without physical separation (Figs. 1 and 2, Table 1). The advantages of tagging molecules are reduced loss of material during separation and fewer limitations of handling samples in such low volumes. Tagging molecules also allows preamplification of materials, after which the sample can be split into aliquots for material-specific library construction.
Different molecular tags applied to a monomeric cellular material can provide multimodal information that a single material includes in a cell. For investigating other information in 5’ or 3’ regions of RNA, such as B cell or T cell receptor sequences, separate PCR assays using different primers targeting template switching oligo (TSO) at 5’ (Picelli et al., 2014) or oligo-dT at 3’ provide region-specific single-cell libraries demonstrating cell subtype-specific isoforms (Hu et al., 2020). For comparison between cytosolic RNA and nuclear RNA, the microfluidic vessel retains the nucleus and passes lysed cytoplasm through the channel, allowing separate library construction and revealing regulatory networks in RNA processing (Abdelmoez et al., 2018). Among the information extracted from gDNA, researchers can simultaneously capture chromatin structure with DNA methylation in a single cell. After crosslinking, enzyme digestion, and ligation in 3C or Hi-C methods to maintain chromatin conformation, consecutive bisulfite conversion provides DNA methylation profiles in the sequencing of the cytosine to thymine ratio in scMethyl-HiC (Li et al., 2019) and sn-m3c-seq (Lee et al., 2019; Li et al., 2019). Chromatin accessibility and DNA methylation are obtained by GpC methylation followed by bisulfite conversion in iscCOOL-seq (Gu et al., 2019). RETrace provides a method to digest DNA differentially by region-specific enzymes, and the fragments are separately captured by unique adapters (Wei and Zhang, 2020). CUT&Tag2for1 uses Tn5-conjugated secondary antibodies targeting primary antibodies against both active and repressive marks in chromatin, providing both active and repressive states from a DNA region in a cell (Janssens et al., 2022).
Targeted amplification of specific regions in another material (such as gDNA) is relatively well merged with conventional single-cell RNA-seq. In TARGET-seq, the mutation hotspot in a cancerous cell is amplified by site-specific adapters in the portion of gDNA, providing genetic tumor heterogeneity of mutant and nonmutant cells (Rodriguez-Meira et al., 2019). In sc(Insert + RNA), inserted lineage barcodes in gDNA are amplified by site-specific adapters followed by consecutive cDNA amplification, validating lineage commitments at fate branching points in developing organoids (Kim et al., 2020). Poly-dT-mediated cDNA/second-strand synthesis followed by preamplification allows the splitting of samples into small aliquots (10%-20%) that generate an additional target-specific library. For monitoring TF-binding sites, scDAM&T-seq uses DNA methyltransferase (DAM) fused to an antibody against TFs (Rooijers et al., 2019), and it records sequence-specific (GmATC) modification around the TF-binding site that is further recognized by a specific adapter identifying the site of DpnI digestion (cutting GmATC but not GATC). Targeted extraction is also applied to verify the guide RNA identity of CRISPR/Cas9 experiments with chromatin accessibility in Perturb-ATAC (Rubin et al., 2019).
To leverage epigenomic states recorded in gDNA and the transcriptome without isolation of RNA or nuclei, Tn5 transposition inserts unique adapters into double-stranded DNA, thereby distinguishing DNA from single-stranded RNA. Thus, gDNA is tagged by Tn5, and the poly-dT adapter accompanies mRNA for tagging. In sci-CAR-seq (Cao et al., 2018), the Split&Pooling method uses an oligo-dT-adapter bearing a well-specific barcode for mRNA indexing and a Tn5 transposase bearing the same barcode for gDNA indexing. Second-strand synthesis of cDNA allows split samples for separated library construction. In SNARE-seq, a nucleus is captured in a droplet after Tn5-transposition, and subsequent poly-T mediated RT renders successful incorporation of a unique molecular adapter to gDNA and mRNA (Chen et al., 2019). Gel beads coated with counter adapters capture each molecular tag. Although the fundamental function of Tn5 is not to tag single-stranded nucleic acids, such as mRNA, researchers have found that Tn5 can insert an adapter into cDNA/RNA (Di et al., 2020; Lu et al., 2020). In ISSAAC-seq (Xu et al., 2022), applying Tn5 with unique adapters generates gDNA tags in the mixture with mRNA, and using another set of Tn5 with different adapters provides an mRNA tag by inserting the adapter into the mRNA/cDNA hybrid after RT. These results provide the dynamics of the covariance of chromatin accessibility and transcription across heterogeneous tissues, such as the kidney or cerebral cortex. To assess DNA methylation and chromatin accessibility with the transcriptome, snmCAT-seq uses me-dCTP to generate fully methylated double-stranded cDNA in a lysed nucleus with the treatment of GpC methyltransferase (Luo et al., 2022). After bisulfite treatment, the cytosine to thymine ratio in CpG sequences discriminates the amplified RNA library (high) and DNA library (low). 
This joint profiling provides specific enrichment of genetic risk for neuropsychiatric traits in the postmortem human frontal cortex.
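The read-level separation described for snmCAT-seq, in which fully methylated cDNA keeps its cytosines after bisulfite treatment while gDNA loses most of them, can be sketched as a simple classifier. The Python example below is a simplified illustration and not the published pipeline: it counts all reference cytosines in an aligned read rather than a specific sequence context, and the 0.5 threshold and function names are assumptions chosen for the example.

```python
def retained_cytosine_fraction(reference: str, read: str) -> float:
    """Fraction of reference cytosines that are still C in the
    bisulfite-converted read (C = protected, T = converted)."""
    if len(reference) != len(read):
        raise ValueError("reference and read must be aligned to equal length")
    kept = converted = 0
    for ref_base, read_base in zip(reference, read):
        if ref_base != "C":
            continue
        if read_base == "C":
            kept += 1
        elif read_base == "T":
            converted += 1
        # Any other base is a mismatch and carries no methylation information.
    informative = kept + converted
    if informative == 0:
        raise ValueError("read covers no informative cytosines")
    return kept / informative


def classify_read(reference: str, read: str, threshold: float = 0.5) -> str:
    """Label a read as RNA-derived (high retention) or DNA-derived (low).
    The threshold is a hypothetical value for illustration."""
    fraction = retained_cytosine_fraction(reference, read)
    return "RNA" if fraction >= threshold else "DNA"


if __name__ == "__main__":
    reference = "ACCTGCAC"                       # made-up sequence
    print(classify_read(reference, "ACCTGCAC"))  # all cytosines retained
    print(classify_read(reference, "ATTTGTAC"))  # most cytosines converted
```

In practice the cut-off would have to be chosen from the observed distribution of retention values, because endogenously methylated gDNA reads also retain some cytosines.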
Tagging protein with the transcriptome
Because material in a cell is limited and RNA readouts are sequencing-based, tagging proteins with nucleic acid adapters bearing specific barcodes enables single-cell multiomics for protein profiling. CITE-seq (Stoeckius et al., 2017) and REAP-seq (Peterson et al., 2017) introduce a protein-specific antibody conjugated to poly-A oligonucleotides containing a barcode for antibody identification. After the interaction of poly-A-conjugated antibodies with surface proteins in bulk cells, a cell with bound antibodies is isolated in a droplet, and a 10× Genomics 3’ RNA bead coated by barcoded oligo-dT simultaneously captures the poly-A conjugated antibody and poly-A tails of mRNA. inCITE-seq uses fixation and permeabilization on the nucleus, allowing oligo-conjugated antibodies to reach intranuclear proteins (Chung et al., 2021), which has been used to understand how a combination of TFs configures gene expression in the mouse brain. Additionally, in RAID (Gerlach et al., 2019), mild fixation and permeabilization are simultaneously applied to an intact cell to provide the joint profiles of cell surface proteins and intracellular proteins. Instead of 10× Genomics 3’ RNA beads, the 10× Genomics 5P/V(D)J kit provides RNA beads containing TSO sequences (GGG) to capture the terminal transferase-mediated cytosine repeat (CCC) sequences at the 5’ end of the transcriptome. In ECCITE-seq (Mimitou et al., 2019), a cell stained with antibodies conjugated to CCC-containing oligos is applied to a droplet, and the TSO reaction with RT using both poly-T primers and custom RT primers targets the guide RNA of the CRISPR/Cas9 system. This technique has been used to provide surface protein expression, 5’ end-specific transcriptome, and T cell receptor sequences with guide RNA identity of CRISPR perturbation analysis in human peripheral blood mononuclear cells (PBMCs).
Tagging protein with the epigenome
To record chromatin accessibility with protein profiling, Tn5 transposition is performed in permeabilized cells after in situ antibody staining. In ICICLE-seq, a cell stained by a poly-A sequence-conjugated antibody is permeabilized by digitonin followed by tagmentation by customized Tn5-containing poly-A adapters (Swanson et al., 2021). Antibodies and gDNA are captured by 10× Genomics 3’ RNA beads coated with barcoded poly-T. Similarly, ASAP-seq uses oligo-conjugated antibodies to stain cells with the modification that adds bridging oligos to render conjugated oligos compatible with 10× Genomics single-cell ATAC beads (Mimitou et al., 2021). In addition, fixation and permeabilization before Tn5 tagmentation in this approach allow for retaining mitochondrial DNA for additional analysis, such as single-cell lineage tracing by mutations. Although oligo-conjugated antibody panels are increasingly used to detect various epitopes, scaling these panels for protein profiling is challenging. The recently developed PHAGE-ATAC adopts phage display processes rendering epitopes of interest on the outside of bacteriophages (Fiskin et al., 2022). Inserting a gene encoding a nanobody recognizing an epitope and unique adapters into a gene for phage coat protein allows engineering of a phagemid library against all proteins of interest. The unique adapters encoding nanobody identities are then captured by 10× Genomics beads.
Profiles of protein, DNA, and RNA
Recently developed gel beads bearing barcoded oligo-dT and Tn5 adapters (10× Genomics) increase the capacity to capture different molecules without competing for binding to identical adapters on the bead among molecules. This platform allows for the simultaneous capture of the majority of materials bearing cellular information, including DNA, RNA, and protein. TEA-seq (Swanson et al., 2021), which was developed from ICICLE-seq, uses commercial Tn5 with unique adapters that are distinguished from oligo-dT-captured RNA and poly-A-conjugated antibody identity by the 10× Genomics multiome platform. DOGMA-seq (Mimitou et al., 2021), which was developed from ASAP-seq, also uses the same platform to capture mRNA with Tn5 transposition and protein of interest, with clone tracing by mitochondrial mutation. DOGMA-seq has been used to reveal regulatory networks in chromatin, RNA, and surface proteins during hematopoietic differentiation and PBMC stimulation. A similar method, NEAT-seq, improves the detection of nuclear proteins leveraging nonspecific binding of oligos by adding E. coli ssDNA-binding proteins to reduce backgrounds (Chen et al., 2022). This approach profiles subsets of CD4 memory T cells by TFs with regulatory activity through transcription, translation, and chromatin accessibility.
In single-cell multiomics technologies using antibody-derived tags (ADTs), the fixation and permeabilization needed to capture surface protein profiles together with Tn5 transposition increase cell doublets beyond what can be filtered in silico. To address this, cell hashing with unique barcodes is performed by adding hashtag oligo (HTO)-conjugated antibodies targeting universal cell surface antigens, such as glycoproteins (Chen et al., 2022; Mimitou et al., 2019, 2021; Swanson et al., 2021). Filtering out cells carrying multiple HTOs reduces the doublet-mediated background caused by fixation. Cell hashtags also enable sample multiplexing, because each pooled sample can be identified by its hashtag. In scifi-RNA-seq (Datlinger et al., 2021), permeabilized cells or nuclei in different wells are preindexed with barcoded oligo-dT primers by RT. Droplets overloaded with multiple cells are then computationally demultiplexed to individual cells by the barcode. Instead of oligo-conjugated antibodies, lipid- and cholesterol-modified oligonucleotides (LMOs and CMOs) incorporate barcodes into the plasma membrane for cell hashing in MULTI-seq (McGinnis et al., 2019). The various combinations between streptavidin-conjugated oligo-dT barcodes and biotin-conjugated concanavalin A beads massively increase the cell hashing capacity in the CASB method (Fang et al., 2021). These approaches allow thousand-sample multiplexing, which, in theory, reduces costs.
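The logic of hashtag-based demultiplexing and doublet filtering can be sketched with a simple threshold rule. The Python example below is a minimal illustration, not the algorithm of any cited study: the hashtag names, the minimum count of 10, and the fivefold dominance ratio are hypothetical values, whereas published tools typically fit statistical models to the hashtag count distributions.

```python
def assign_hashtag(hto_counts: dict, min_count: int = 10,
                   min_ratio: float = 5.0) -> tuple:
    """Classify one cell barcode from its hashtag oligo (HTO) counts.

    Returns (label, sample), where label is 'singlet', 'doublet', or
    'negative' and sample is the winning hashtag for singlets, else None.
    Both thresholds are illustrative assumptions.
    """
    ranked = sorted(hto_counts.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] < min_count:
        return ("negative", None)        # no hashtag clearly detected
    top_name, top_count = ranked[0]
    second_count = ranked[1][1] if len(ranked) > 1 else 0
    # Two well-supported hashtags on one barcode suggest that cells from
    # two different samples were captured together.
    if second_count >= min_count and top_count < min_ratio * second_count:
        return ("doublet", None)
    return ("singlet", top_name)


if __name__ == "__main__":
    cells = {
        "cell_1": {"HTO_A": 120, "HTO_B": 3, "HTO_C": 1},
        "cell_2": {"HTO_A": 80, "HTO_B": 65, "HTO_C": 2},
        "cell_3": {"HTO_A": 2, "HTO_B": 1, "HTO_C": 0},
    }
    for barcode, counts in cells.items():
        print(barcode, assign_hashtag(counts))
```

Singlets are assigned back to their sample of origin, which is what allows samples to be pooled before loading, and barcodes labeled as doublets are discarded before downstream analysis.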
Applying unique cell barcodes at the onset of a developing system allows the origins of cells to be traced, particularly in embryonic development and disease. Integration of random sequences or curated semirandom sequences theoretically provides identifiers for cells and their clones at the sampling time (Bhang et al., 2015). Clone identifiers help distinguish clones that expand or diminish relative to the starting population when selective pressure is applied, such as during lineage commitment, cancer metastasis, and cancer recurrence. This approach helps to elucidate when a cell decides its fate in development and which cancer cells survive after drug treatment (Eyler et al., 2020; Kim et al., 2020; Weinreb et al., 2020). To record time-resolved cell lineages, introducing a sequence-evolving barcode into cells provides more capacity to record cells and their lineages in various time windows. Insertions and deletions at CRISPR/Cas9 target sequences accumulate progressively over time until the PAM (protospacer adjacent motif) sequence is destroyed. Based on barcode similarity and mutation hierarchy, these methods recapitulate relationships among different tissues and the ancestry of germ layers (extraembryonic and embryonic endoderm), providing a complete reconstruction of cell lineages (Chan et al., 2019; Frieda et al., 2017; Raj et al., 2018). Recently, these techniques have been advanced with terminal transferase and prime-editing methods to avoid PAM-sequence dependency in the recording (Choi et al., 2022; Loveless et al., 2021). Because engineered barcodes are limited to systems that can be genetically manipulated, acquired mutations in genomic or mitochondrial DNA (Mimitou et al., 2021; Park et al., 2021; Swanson et al., 2021) provide an alternative barcode identity and allow tracing of cells and clones in human development and disease.
Innate variable sequences will be another source of cell barcodes to reveal the identity of cell lineages that appear in response to the environment (Zhang et al., 2018).
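How barcode similarity points to shared ancestry can be illustrated with a small example. The Python sketch below is a simplified illustration, not the reconstruction algorithm of the cited studies: each cell is represented by a set of edits observed in its evolving barcode, the edit labels are made up, and relatedness is scored by Jaccard similarity, whereas published methods build full trees from the mutation hierarchy.

```python
def jaccard(edits_a: set, edits_b: set) -> float:
    """Shared edits divided by all edits seen in either cell."""
    union = edits_a | edits_b
    return len(edits_a & edits_b) / len(union) if union else 1.0


def closest_relative(cell: str, edits_by_cell: dict) -> str:
    """Return the other cell whose barcode edits overlap most with `cell`.
    Ties are resolved in favour of the cell listed first."""
    others = [name for name in edits_by_cell if name != cell]
    if not others:
        raise ValueError("need at least two cells to compare")
    return max(others,
               key=lambda name: jaccard(edits_by_cell[cell], edits_by_cell[name]))


if __name__ == "__main__":
    # Hypothetical edits written as "site:mutation".
    edits_by_cell = {
        "cell_1": {"s1:del3", "s2:ins1"},
        "cell_2": {"s1:del3", "s2:ins1", "s4:del2"},  # inherits both, adds one
        "cell_3": {"s1:del3", "s5:del7"},             # shares only the early edit
        "cell_4": {"s6:ins2"},                        # unrelated clone
    }
    for cell in edits_by_cell:
        print(cell, "->", closest_relative(cell, edits_by_cell))
```

Edits shared by many cells occurred early and define deep branches, whereas edits private to a few cells occurred late, which is the hierarchy that tree-building methods exploit.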
Most cell states that respond to and report on the environment are recorded in various cellular molecules, such as DNA, RNA, and proteins. Thus, reconstructing cellular states responding to biological events, such as development or diseases, requires extracting all the materials containing that information. To this end, single-cell barcoding technologies are rapidly updated and improved to capture multimodal data from a cell. Using the molecular identity of cellular materials, separating DNA or RNA allows the construction of both libraries simultaneously. Without physical separation, molecular tagging with single-cell barcodes extends the number of target materials that can be captured at a time. Furthermore, the cell hashing strategy incorporates hashtag barcodes into cell groups, thereby extending throughputs in complex samples. However, current methods still miss materials that carry necessary information. Naked DNA, RNA, and proteins do not capture time-resolved cell states. Examples of what is missed include phosphorylation or methylation of proteins; noncoding RNAs with no poly-A tails; metabolite or ion exchanges in neuronal responses; and cell-to-cell communication by secreted molecules. Abnormalities in these events frequently characterize cells in diseases, such as cancer. Although recent advances in spatial transcriptomics adopt methods developed with imaging technology, they have limitations of their own that remain to be solved (Eng et al., 2019; Rodriques et al., 2019; Williams et al., 2022). Therefore, breakthroughs in single-cell barcoding technology will be required to analyze the actual cellular states representing biological systems and raise new ideas to solve fundamental questions in biology.
This work was supported by the National Research Foundation (NRF) of Korea (2021R1F1A104962311) and the Gachon University Research Fund (GCU-202102820001).
I.S.K. wrote the manuscript.
The author has no potential conflicts of interest to disclose.
Summary of recently published single-cell multiomics studies
Method | Genome | Transcript | Protein | Isolation method | Throughput | Capturing | Cell | Reference | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Genome | Epigenome1 | Epigenome2 | Whole cDNA | Partial cDNA | Surface | Intra-cellular | ||||||||
(Epi)genome profiling: DNA/RNA | ||||||||||||||
sc(ATAC+RNA) | ChromAcc (Tn5) | Full-length | FACS | Low (~96) | Poly-dT separation (biotin) | Intact | (Reyes et al., 2019) | |||||||
ASTAR-seq | ChromAcc (Tn5) | Full-length | Droplet, C1 | Mid (96×) | Poly-dT separation (biotin) | Intact | (Xing et al., 2020) | |||||||
SHARE-seq | ChromAcc (Tn5) | 3'-end | Split&Pool | High | Poly-dT separation (biotin) | Intact, Nuclei | (Ma et al., 2020) | |||||||
Paired-seq | ChromAcc (Tn5) | 3'-end | Split&Pool | High | Adapter ligation | Nuclei | (Zhu et al., 2019) | |||||||
sci-CAR | ChromAcc (Tn5) | 3'-end | Split&Pool | High | Material tagging, splitting | Nuclei | (Cao et al., 2018) | |||||||
SNARE-seq | ChromAcc (Tn5) | 3'-end | Droplet, 10× | High | Material tagging, splitting | Nuclei | (Chen et al., 2019) | |||||||
ISSAAC-seq | ChromAcc (Tn5) | 3'-end | Droplet, 10× | High | Material tagging, splitting | Nuclei | (Xu et al., 2022) | |||||||
scCAT-seq | ChromAcc (Tn5) | Full-length | FACS | Low (~96) | Nuclei separation (spin) | Intact | (Liu et al., 2019) | |||||||
scSIDR-seq | CNV | Full-length | FACS | Low (~48) | Nuclei separation (ab/magnet) | Intact | (Han et al., 2018) | |||||||
scChaRM-seq | DNAme (WGBS) | ChromAcc (GpC) | 3'-end | Hand picking | Low (~96) | Poly-dT separation (biotin) | Intact | (Yan et al., 2021) | ||||||
scNMT-seq | DNAme (WGBS) | ChromAcc (GpC) | Full-length | FACS | Low (~96) | Poly-dT separation (biotin) | Intact | (Clark et al., 2018) | ||||||
scTrio-seq2 | CNV | DNAme (RRBS) | 3'-end | Hand picking | Low (~96) | Nuclei separation (magnet) | Intact | (Bian et al., 2018) | ||||||
scNOMeRe-seq | DNAme (WGBS) | ChromAcc (GpC) | Full-length | Hand picking | Low (~96) | Nuclei separation (magnet) | Intact | (Wang et al., 2021) | ||||||
snmCAT-seq | DNAme (WGBS) | ChromAcc (GpC) | Full-length | FACS | Low (~384) | Material tagging (C, mC) | Nuclei | (Luo et al., 2022) | ||||||
Targeted amplification | ||||||||||||||
scDAM&T-seq | TF/DNA binding | 3'-end | FACS | Low (~384) | Material tagging | Intact | (Rooijers et al., 2019) | |||||||
TARGET-seq | Mutation | Full-length | FACS | Low (~384) | Material tagging, splitting | Intact | (Rodriguez-Meira et al., 2019) | |||||||
sc(Insert + RNA) | Targeted | 3'-end | FACS | Low (~384) | Material tagging, splitting | Intact | (Kim et al., 2020) | |||||||
Perturb-ATAC | ChromAcc (Tn5) | gRNA | Droplet, C1 | Mid (96×) | Material tagging, splitting | Intact | (Rubin et al., 2019) | |||||||
Multimodal information in a molecule | ||||||||||||||
scMethyl-HiC | DNAme (WGBS) | Hi-C | FACS | Low (~96) | DNA (multimodality) | Nuclei | (Li et al., 2019) | |||||||
RETrace | Micro-satellite | DNAme (RRBS) | FACS | Low (~96) | DNA (multimodality) | Intact | (Wei and Zhang, 2020) | |||||||
sn-m3C-seq | DNAme (WGBS) | Hi-C (m3C) | FACS | Low (~384) | DNA (multimodality) | Nuclei | (Lee et al., 2019) | |||||||
iscCOOL-seq | DNAme (WGBS) | ChromAcc (GpC) | Hand picking | Low (~96) | DNA (multimodality) | Intact | (Gu et al., 2019) | |||||||
CUT&Tag2for1 | Chrom_Act, | Chrom_Rep | Sorter | Mid (~5,184) | DNA (multimodality) | Nuclei | (Janssens et al., 2022) | |||||||
scRCAT-seq | 5'-end, 3'-end | Hand picking | Low (~96) | RNA (multimodality) | Intact | (Hu et al., 2020) | ||||||||
SINC-seq | cytRNA, nucRNA | Microfluidic device | High | RNA (multimodality) | Intact | (Abdelmoez et al., 2018) | ||||||||
Protein profiling with RNA | ||||||||||||||
REAP-seq | 3'-end | Surface | Droplet, 10× | High | Material tagging | Intact | (Peterson et al., 2017) | |||||||
CITE-seq | 3'-end | Surface | Droplet, 10× | High | Material tagging | Intact | (Stoeckius et al., 2017) | |||||||
inCITE-seq | 3'-end | Intra-nuclear | Droplet, 10× | High | Material tagging | Nuclei | (Chung et al., 2021) | |||||||
RAID | 3'-end | Surface | Intra | FACS | Low (~384) | Material tagging | Intact | (Gerlach et al., 2019) | ||||||
ECCITE-seq | gRNA + TCR | 5'-end | Surface + cell hashing | Droplet, 10× | High | Material tagging | Intact | (Mimitou et al., 2019) | ||||||
scSTAP | Full-length | Surface | Intra | Hand picking | Low (~96) | Material tagging, splitting | Intact | (Jiang et al., 2022) | ||||||
Protein profiling with chromatin | ||||||||||||||
PHAGE-ATAC | Mitochondria | ChromAcc (Tn5) | Surface | Intra | Droplet, 10× | High | Material tagging | Intact | (Fiskin et al., 2022) | |||||
ASAP-seq | Mitochondria | ChromAcc (Tn5) | Surface + cell hashing | Droplet, 10× | High | Material tagging | Intact | (Mimitou et al., 2021) | ||||||
ICICLE-seq | ChromAcc (Tn5) | Surface | Droplet, 10× | High | Material tagging | Intact | (Swanson et al., 2021) | |||||||
DNA/RNA/protein | ||||||||||||||
DOGMA-seq | Mitochondria | ChromAcc (Tn5) | 3'-end | Surface + cell hashing | Droplet, 10× | High | Material tagging | Intact | (Mimitou et al., 2021) | |||||
TEA-seq | ChromAcc (Tn5) | 3'-end | Surface | Droplet, 10× | High | Material tagging | Intact | (Swanson et al., 2021) | ||||||
NEAT-seq | ChromAcc (Tn5) | 3'-end | Surface + cell hashing | Intra | Droplet, 10× | High | Material tagging | Intact | (Chen et al., 2022) |
Department of Microbiology, Gachon University College of Medicine, Incheon 21999, Korea
This review discusses various single-cell multiomics technologies based on how different molecules are targeted with unique molecular barcodes and what information can be extracted. We also discuss recently developed multiomics technologies using next-generation sequencing at single-cell resolution.
Ligation of unique molecular barcodes in each well of a multiwell plate simply provides barcoded single cells that are demultiplexed after sequencing (Fig. 1) (Buenrostro et al., 2018; Kim et al., 2020; Xing et al., 2020). The multiomic approach originates from dividing amplified molecules into different plates, constructing separate libraries of genomic DNA (gDNA) and RNA, and combining the data by well position. Microfluidic systems with droplets substantially increase throughput (over 10,000 cells). A flow containing a diluted single-cell suspension and another flow containing barcoded beads are merged at the flow focus, which continuously generates droplets that each hold a single cell with a unique barcode (Macosko et al., 2015; Prakadan et al., 2017). Although droplet generators limit the physical isolation of multiple molecules (DNA, RNA, or protein), capturing them with unique tags (adapters) on the bead enables demultiplexing and simultaneous separation of each tag. Without physically isolating cells, split-and-pooling techniques (Split&Pooling) generate single-cell barcodes in a cell mixture by combinatorial chemistry (Cao et al., 2017; Rosenberg et al., 2018). The barcoding process starts with lysed but structurally preserved cells being distributed as a mixture into several wells. Molecules are barcoded by adapter ligation within each well, and the cells are then pooled and split into several wells again, followed by another round of barcode ligation. Two to three rounds of this process generate enough barcode complexity to label nearly every single cell uniquely. Because this method relies on the ligation of nucleic acid barcodes, it supports barcoding different molecules simultaneously.
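The relationship between the number of split-and-pool rounds and barcode complexity can be illustrated with a short calculation. The following Python sketch is not taken from any cited protocol; it assumes that every cell lands in a uniformly random well in each round, and the 96-well plate and the 10,000 cells are illustrative values only.

```python
def barcode_space(wells_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations after the given number of rounds."""
    return wells_per_round ** rounds


def expected_collision_fraction(n_cells: int, wells_per_round: int, rounds: int) -> float:
    """Expected fraction of cells that share their full barcode with another cell.

    Assumption (for illustration): each cell picks a well uniformly at random
    and independently in every round.
    """
    space = barcode_space(wells_per_round, rounds)
    # Probability that none of the other n_cells - 1 cells followed the same path.
    p_unique = (1.0 - 1.0 / space) ** (n_cells - 1)
    return 1.0 - p_unique


# Illustrative comparison of one, two, and three rounds on a 96-well plate.
for rounds in (1, 2, 3):
    fraction = expected_collision_fraction(10_000, 96, rounds)
    print(rounds, barcode_space(96, rounds), round(fraction, 4))
```

Under these assumptions, a single round cannot separate 10,000 cells at all, whereas a third round reduces the expected share of barcode collisions to roughly one percent, which is consistent with the two to three rounds described above.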
Multiomic approaches in single-cell technology fundamentally require capturing different cellular molecules simultaneously (Figs. 1 and 2, Table 1). Although the loss of materials when separating molecules is inevitable, physically dividing materials at the beginning of experiments allows easy handling and flexible application, such as constructing molecule-specific libraries.
The designed poly-T adapter selectively anchors the poly-A tail at the 3’ end of messenger RNA (mRNA) molecules. Biotinylated poly-T adapters and streptavidin-coated magnetic beads (poly-T-magnetic beads) have been commonly used to separate mRNA from lysed cells in well-based single-cell isolation. After mRNA is pulled down to the bottom of the well, the supernatant is used to build a gDNA library in a separate plate. A cDNA library is then constructed from the beads in resuspension buffer, and single-cell barcodes are inserted by unique library adapters per well. For the gDNA library, transposition by a transposase (Tn5) inserts unique adapters into double-stranded gDNA. Due to steric hindrance, Tn5 binds to dsDNA in nucleosome-free regions, providing the sequence of epigenetically regulated elements in which chromatin is accessible to regulatory proteins, such as transcription factors (TFs). Using this method, established in ATAC-seq (Buenrostro et al., 2013), researchers have captured epigenetic states from transcriptome-based cell clusters of mouse embryonic stem cells (mESCs) in ASTAR-seq (Xing et al., 2020) and immune-cell profiles in sc(ATAC + RNA) sequencing (Reyes et al., 2019). Bisulfite conversion of unmethylated cytosine to uracil, which is read as thymine after amplification while methylated cytosine remains unchanged, provides the DNA methylation status, depicting another layer of gene regulation. Because whole genome bisulfite conversion is performed on naked DNA covering both heterochromatin and euchromatin, GpC methyltransferase is applied beforehand to record chromatin accessibility by methylating cytosine at GpC sites in open chromatin regions; subsequent bisulfite conversion of the extracted gDNA records chromatin accessibility and genome-wide DNA methylation simultaneously. In scNMT-seq, GpC methylation and bisulfite conversion with biotin-dT-mediated RNA-seq reveal the dynamic coupling of epigenetic states among transcriptome-based cell clusters in mESC differentiation (Clark et al., 2018). 
A similar protocol called scChaRM-seq is applied to human oocytes and ovarian somatic cells, providing a detailed map of epigenetic landscapes at single-cell resolution (Yan et al., 2021).
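How one bisulfite read can report both chromatin accessibility and endogenous DNA methylation can be sketched in a few lines of Python. The example is a simplified illustration rather than the pipeline of any cited method: it assumes a forward-strand read already aligned base by base to the reference, treats a retained C as methylated and a C read as T as unmethylated, and follows the commonly used convention that a cytosine preceded by G reports GpC methyltransferase activity (accessibility), a cytosine followed by G reports endogenous CpG methylation, and GCG is ambiguous. The sequences and function names are hypothetical.

```python
def classify_context(ref: str, i: int) -> str:
    """Sequence context of the cytosine at ref[i] (must not be a terminal base)."""
    is_gpc = ref[i - 1] == "G"  # target of the GpC methyltransferase
    is_cpg = ref[i + 1] == "G"  # endogenous CpG methylation site
    if is_gpc and is_cpg:
        return "ambiguous"      # GCG: both signals overlap
    if is_gpc:
        return "GpC"
    if is_cpg:
        return "CpG"
    return "other"


def call_cytosines(ref: str, read: str) -> list:
    """Return (position, context, state) for each informative reference cytosine.

    Assumes ref and read have the same length and the same (forward) strand.
    """
    calls = []
    for i in range(1, len(ref) - 1):  # skip terminal bases with unknown context
        if ref[i] != "C":
            continue
        if read[i] == "C":
            state = "methylated"      # protected from bisulfite conversion
        elif read[i] == "T":
            state = "unmethylated"    # converted and read as thymine
        else:
            continue                  # mismatch: no call
        calls.append((i, classify_context(ref, i), state))
    return calls


# Hypothetical reference and bisulfite-converted read.
reference = "AGCTACGTTGCGA"
bs_read = "AGCTATGTTGCGA"
calls = call_cytosines(reference, bs_read)
```

In this toy read, the methylated GpC site would be interpreted as accessible chromatin and the converted CpG site as unmethylated, while the GCG site would be excluded from both readouts.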
Conversely, breaking down only the cytoplasmic membrane with an optimized lysis buffer and then spinning down the intact nucleus captures gDNA enclosed within the nucleus. This nuclear isolation method keeps many approaches targeting the genome and epigenome available for samples separated from RNA. In scCAT-seq, Tn5 transposition in the plate where the nucleus is spun down provides regulatory relationships between chromatin accessibility and the transcriptome in early embryos (Liu et al., 2019). To reduce the possibility of material loss or contamination during separation, magnet-conjugated antibodies against surface antigens of the nucleus hold nuclei at the bottom of the well in scSIDR-seq (Han et al., 2018). Furthermore, genetic alterations, such as copy-number variations (CNVs) and single-nucleotide variations (SNVs), can be captured by whole genome sequencing from the nuclei. To obtain the DNA methylation status with the transcriptome, bisulfite treatment of gDNA extracted from the magnet-isolated nuclei of colorectal cancer cells enables whole genome bisulfite sequencing with SNV analysis in scTrio-seq2 (Bian et al., 2018). In scNOMeRe-seq, GpC methyltransferase generates methylated cytosine in open chromatin regions of the magnet-isolated single-cell nucleus (Wang et al., 2021). Bisulfite conversion then records chromatin accessibility and genome-wide DNA methylation simultaneously. With RNA-seq from the separated samples, this approach reveals that coordination among multiple epigenetic layers regulates the reconstruction of genetic lineages in early embryos.
In approaches that apply the Split&Pooling method, nuclei isolation and barcoded oligo-dT capture are used together, providing single-cell multiomics at low cost and on a massive scale. In SHARE-seq, Tn5 transposition and biotin-oligo-dT-mediated reverse transcription (RT) for cDNA are performed in bulk on fixed nuclei (Ma et al., 2020). Well-specific barcoded oligonucleotides are then distributed across the well plate for hybridization as molecule identifiers. cDNA separated by streptavidin beads and chromatin accessibility profiled by Tn5 provide evidence that chromatin accessibility precedes gene expression during lineage commitment of mouse skin at single-cell resolution. In Paired-seq, although barcode generation by rounds of split and pooling is similar to SHARE-seq, restriction enzymes cut gDNA or RNA at predesigned sites in the adapters from split samples, without oligo-dT-based physical isolation of cDNA (Zhu et al., 2019). This method has been applied to the developing forebrain to resolve transcriptome-based, cell-type-specific gene regulatory programs.
Protein
Proteins are difficult to separate from other cellular materials because they share no common sequence that can be used as an anchor, and they are not compartmentalized in specific cellular organelles but rather distributed across the cytoplasm and nucleus. The scSTAP method splits the sample, using one half for the transcriptome library and the other half for the protein library submitted to LC-MS/MS (Jiang et al., 2022). Although this well-based approach loses some of the sample in splitting, it allows the characterization of protein profiles in a single cell together with the transcriptome. Molecular tagging is a more feasible way to capture target proteins, which will be discussed later.
Inserting unique molecular barcodes (tags) into each cellular material and demultiplexing through the sequencing platform allows material isolation without physical separation (Figs. 1 and 2, Table 1). The advantages of tagging molecules are that it reduces the loss of materials during separation and overcomes the limitations of handling samples in such low volumes. Tagging molecules also allows preamplification of materials, after which the sample can be split into aliquots for material-specific library construction.
Different molecular tags applied to a single type of cellular material can provide the multimodal information that this material carries in a cell. For investigating information in the 5’ or 3’ regions of RNA, such as B cell or T cell receptor sequences, separate PCR assays using different primers targeting the template switching oligo (TSO) at the 5’ end (Picelli et al., 2014) or oligo-dT at the 3’ end provide region-specific single-cell libraries demonstrating cell subtype-specific isoforms (Hu et al., 2020). For comparison between cytosolic RNA and nuclear RNA, a microfluidic device retains the nucleus and passes the lysed cytoplasm through the channel, allowing separate library construction and revealing regulatory networks in RNA processing (Abdelmoez et al., 2018). Among the information extracted from gDNA, researchers can simultaneously capture chromatin structure with DNA methylation in a single cell. After crosslinking, enzyme digestion, and ligation as in 3C or Hi-C methods to maintain chromatin conformation, subsequent bisulfite conversion provides DNA methylation profiles from the cytosine to thymine ratio in the sequencing reads in scMethyl-HiC (Li et al., 2019) and sn-m3C-seq (Lee et al., 2019; Li et al., 2019). Chromatin accessibility and DNA methylation are obtained by GpC methylation followed by bisulfite conversion in iscCOOL-seq (Gu et al., 2019). RETrace digests DNA differentially with region-specific enzymes, and the fragments are separately captured by unique adapters (Wei and Zhang, 2020). CUT&Tag2for1 uses Tn5-conjugated secondary antibodies targeting primary antibodies against both active and repressive marks in chromatin, providing both active and repressive states from a DNA region in a cell (Janssens et al., 2022).
Targeted amplification of specific regions in another material (such as gDNA) combines relatively well with conventional single-cell RNA-seq. In TARGET-seq, mutation hotspots in a cancerous cell are amplified by site-specific adapters from a portion of the gDNA, revealing the genetic tumor heterogeneity of mutant and nonmutant cells (Rodriguez-Meira et al., 2019). In sc(Insert + RNA), inserted lineage barcodes in gDNA are amplified by site-specific adapters followed by consecutive cDNA amplification, validating lineage commitments at fate branching points in developing organoids (Kim et al., 2020). Poly-dT-mediated cDNA/second-strand synthesis followed by preamplification allows the sample to be split into small aliquots (10%-20%), from which an additional target-specific library is generated. For monitoring TF-binding sites, scDAM&T-seq uses DNA methyltransferase (DAM) fused to an antibody against TFs (Rooijers et al., 2019), and it records sequence-specific (GmATC) modification around the TF-binding site, which is then recognized by a specific adapter identifying the site of DpnI digestion (cutting GmATC but not GATC). Targeted extraction is also applied to verify the guide RNA identity of CRISPR/Cas9 experiments with chromatin accessibility in Perturb-ATAC (Rubin et al., 2019).
To leverage epigenomic states recorded in gDNA and the transcriptome without isolation of RNA or nuclei, Tn5 transposition inserts unique adapters into double-stranded DNA, thereby distinguishing DNA from single-stranded RNA. Thus, gDNA is tagged by Tn5, and the poly-dT adapter accompanies mRNA for tagging. In sci-CAR-seq (Cao et al., 2018), the Split&Pooling method uses an oligo-dT-adapter bearing a well-specific barcode for mRNA indexing and a Tn5 transposase bearing the same barcode for gDNA indexing. Second-strand synthesis of cDNA allows split samples for separated library construction. In SNARE-seq, a nucleus is captured in a droplet after Tn5-transposition, and subsequent poly-T mediated RT renders successful incorporation of a unique molecular adapter to gDNA and mRNA (Chen et al., 2019). Gel beads coated with counter adapters capture each molecular tag. Although the fundamental function of Tn5 is not to tag single-stranded nucleic acids, such as mRNA, researchers have found that Tn5 can insert an adapter into cDNA/RNA (Di et al., 2020; Lu et al., 2020). In ISSAAC-seq (Xu et al., 2022), applying Tn5 with unique adapters generates gDNA tags in the mixture with mRNA, and using another set of Tn5 with different adapters provides an mRNA tag by inserting the adapter into the mRNA/cDNA hybrid after RT. These results provide the dynamics of the covariance of chromatin accessibility and transcription across heterogeneous tissues, such as the kidney or cerebral cortex. To assess DNA methylation and chromatin accessibility with the transcriptome, snmCAT-seq uses me-dCTP to generate fully methylated double-stranded cDNA in a lysed nucleus with the treatment of GpC methyltransferase (Luo et al., 2022). After bisulfite treatment, the cytosine to thymine ratio in CpG sequences discriminates the amplified RNA library (high) and DNA library (low). 
This joint profiling provides specific enrichment of genetic risk for neuropsychiatric traits in the postmortem human frontal cortex.
Tagging protein with the transcriptome
Given the limited material in a cell and the sequencing-based readout of RNA, tagging proteins with nucleic acid adapters bearing specific barcodes enables single-cell multiomics for protein profiling. CITE-seq (Stoeckius et al., 2017) and REAP-seq (Peterson et al., 2017) introduce a protein-specific antibody conjugated to poly-A oligonucleotides containing a barcode for antibody identification. After the interaction of poly-A-conjugated antibodies with surface proteins in bulk cells, a cell with bound antibodies is isolated in a droplet, and a 10× Genomics 3’ RNA bead coated with barcoded oligo-dT simultaneously captures the poly-A-conjugated antibody tags and the poly-A tails of mRNA. inCITE-seq uses fixation and permeabilization of the nucleus, allowing oligo-conjugated antibodies to reach intranuclear proteins (Chung et al., 2021), which has been used to understand how a combination of TFs configures gene expression in the mouse brain. Additionally, in RAID (Gerlach et al., 2019), mild fixation and permeabilization are applied to an intact cell to provide joint profiles of cell surface proteins and intracellular proteins. Instead of 10× Genomics 3’ RNA beads, the 10× Genomics 5P/V(D)J kit provides RNA beads containing TSO sequences (GGG) to capture the terminal transferase-mediated cytosine repeat (CCC) sequences at the 5’ end of the transcriptome. In ECCITE-seq (Mimitou et al., 2019), a cell stained with antibodies conjugated to CCC-containing oligos is loaded into a droplet, and the TSO reaction proceeds with RT using both poly-T primers and custom RT primers that target the guide RNA of the CRISPR/Cas9 system. This technique has been used to provide surface protein expression, the 5’ end-specific transcriptome, and T cell receptor sequences with the guide RNA identity of CRISPR perturbation analysis in human peripheral blood mononuclear cells (PBMCs).
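Because the antibody-derived oligos are captured on the same barcoded bead as mRNA, protein counts can be recovered from the sequencing reads by looking up the antibody barcode. A minimal Python sketch of this counting step is shown below. It is not the published CITE-seq or REAP-seq pipeline; the barcode sequences, the antibody names, and the read tuples are hypothetical, and exact barcode matching without error correction is assumed.

```python
from collections import defaultdict

# Hypothetical antibody barcodes and the proteins they identify.
ANTIBODY_BARCODES = {"ACGTACGT": "CD3", "TTGACCAA": "CD19"}


def count_adts(reads, antibody_barcodes=ANTIBODY_BARCODES):
    """Count unique molecules per cell and protein.

    reads: iterable of (cell_barcode, umi, antibody_barcode) tuples.
    PCR duplicates (same cell, UMI, and protein) are counted once.
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for cell_barcode, umi, tag in reads:
        protein = antibody_barcodes.get(tag)
        if protein is None:
            continue  # not an antibody-derived tag (exact matching assumed)
        key = (cell_barcode, umi, protein)
        if key in seen:
            continue  # duplicate read of the same molecule
        seen.add(key)
        counts[cell_barcode][protein] += 1
    return {cell: dict(per_protein) for cell, per_protein in counts.items()}


# Hypothetical reads: two cells, one PCR duplicate, one unknown tag.
example_reads = [
    ("CELL1", "UMI_A", "ACGTACGT"),
    ("CELL1", "UMI_A", "ACGTACGT"),
    ("CELL1", "UMI_B", "ACGTACGT"),
    ("CELL1", "UMI_C", "TTGACCAA"),
    ("CELL2", "UMI_A", "TTGACCAA"),
    ("CELL2", "UMI_D", "GGGGGGGG"),
]
protein_counts = count_adts(example_reads)
```

The resulting per-cell protein table shares its cell barcodes with the transcriptome count matrix, which is what allows both modalities to be joined for the same cell.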
Tagging protein with the epigenome
To record chromatin accessibility with protein profiling, Tn5 transposition is performed in permeabilized cells after in situ antibody staining. In ICICLE-seq, a cell stained with a poly-A sequence-conjugated antibody is permeabilized by digitonin, followed by tagmentation with customized Tn5 containing poly-A adapters (Swanson et al., 2021). Antibodies and gDNA are captured by 10× Genomics 3’ RNA beads coated with barcoded poly-T. Similarly, ASAP-seq uses oligo-conjugated antibodies to stain cells, with a modification that adds bridging oligos to render the conjugated oligos compatible with 10× Genomics single-cell ATAC beads (Mimitou et al., 2021). In addition, fixation and permeabilization before Tn5 tagmentation in this approach retain mitochondrial DNA for additional analysis, such as single-cell lineage tracing by mutations. Although oligo-conjugated antibody panels are increasingly used to detect various epitopes, scaling these panels for protein profiling is challenging. The recently developed PHAGE-ATAC adopts phage display processes rendering epitopes of interest on the outside of bacteriophages (Fiskin et al., 2022). Inserting a gene encoding a nanobody recognizing an epitope and unique adapters into a gene for phage coat protein allows engineering of a phagemid library against all proteins of interest. The unique adapters encoding nanobody identities are then captured by 10× Genomics beads.
Profiles of protein, DNA, and RNA
Recently developed gel beads bearing both barcoded oligo-dT and Tn5 adapters (10× Genomics) increase the capacity to capture different molecules, because the molecules no longer compete for identical adapters on the bead. This platform allows for the simultaneous capture of the majority of materials bearing cellular information, including DNA, RNA, and protein. TEA-seq (Swanson et al., 2021), which was developed from ICICLE-seq, uses commercial Tn5 with unique adapters that the 10× Genomics multiome platform distinguishes from oligo-dT-captured RNA and poly-A-conjugated antibody tags. DOGMA-seq (Mimitou et al., 2021), which was developed from ASAP-seq, also uses the same platform to capture mRNA together with Tn5-transposed chromatin and proteins of interest, with clone tracing by mitochondrial mutations. DOGMA-seq has been used to reveal regulatory networks in chromatin, RNA, and surface proteins during hematopoietic differentiation and PBMC stimulation. A similar method, NEAT-seq, improves the detection of nuclear proteins by adding E. coli ssDNA-binding protein, which blocks nonspecific binding of oligos and reduces background (Chen et al., 2022). This approach profiles subsets of CD4 memory T cells by TFs with regulatory activity through transcription, translation, and chromatin accessibility.
In single-cell multiomics technologies using antibody-derived tags (ADTs), fixation and permeabilization for capturing surface protein profiles with Tn5 transposition increase cell doublets beyond what can be filtered in silico. To address this, cell hashing with unique barcodes is performed by adding hashtag oligo (HTO)-conjugated antibodies targeting universal cell surface antigens, such as glycoproteins (Chen et al., 2022; Mimitou et al., 2019, 2021; Swanson et al., 2021). Filtering out cells that carry multiple HTOs reduces the doublet-mediated background caused by fixation. In addition, cell hashtags enable sample multiplexing by identifying the sample of origin of each cell in a pooled experiment. In scifi-RNA-seq (Datlinger et al., 2021), permeabilized cells or nuclei in different wells are preindexed with barcoded oligo-dT primers by RT. Droplets overloaded with multiple cells are then computationally demultiplexed into individual cells by this barcode. Instead of oligo-conjugated antibodies, lipid- and cholesterol-modified oligonucleotides (LMOs and CMOs) incorporate barcodes into the plasma membrane for cell hashing in MULTI-seq (McGinnis et al., 2019). The various combinations between streptavidin-conjugated oligo-dT barcodes and biotin-conjugated concanavalin A beads massively increase the cell hashing capacity in the CASB method (Fang et al., 2021). These approaches allow thousand-sample multiplexing, which, in theory, reduces costs.
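The filtering logic behind cell hashing can be sketched as a simple classification of each cell barcode by its hashtag counts. The Python example below is an illustration only: the fixed count threshold, the hashtag names, and the counts are assumptions made for this sketch, whereas published demultiplexing tools typically derive a threshold per hashtag from the count distribution.

```python
def classify_cell(hto_counts: dict, min_count: int = 50) -> str:
    """Assign one cell barcode to a sample, or flag it as a doublet or negative.

    hto_counts maps a hashtag name to its UMI count in this cell.
    min_count is an illustrative fixed threshold (an assumption of this sketch).
    """
    positive = sorted(tag for tag, n in hto_counts.items() if n >= min_count)
    if not positive:
        return "negative"   # no hashtag detected: cannot assign a sample
    if len(positive) == 1:
        return positive[0]  # singlet from exactly one hashed sample
    return "doublet"        # more than one hashtag: cells from different samples


# Hypothetical hashtag counts for three cell barcodes.
example_cells = {
    "CELL1": {"HTO_A": 420, "HTO_B": 3},
    "CELL2": {"HTO_A": 300, "HTO_B": 280},
    "CELL3": {"HTO_A": 2, "HTO_B": 5},
}
assignments = {cell: classify_cell(counts) for cell, counts in example_cells.items()}
```

Note that this strategy only detects doublets formed by cells from differently hashed samples; two cells carrying the same hashtag remain indistinguishable from a singlet by hashing alone.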
Applying unique cell barcodes at the start of a developing system allows the origins of cells to be traced, particularly in embryonic development and disease. Integration of random sequences or curated semirandom sequences theoretically provides identifiers for cells and their clones at the sampling time (Bhang et al., 2015). Clone identifiers help distinguish cells (clones) that have expanded or diminished relative to the starting point when selective pressure is applied, such as during lineage commitment, cancer metastasis, and cancer recurrence. This approach helps to elucidate when a cell decides its fate in development and which cancer cells survive after drug treatment (Eyler et al., 2020; Kim et al., 2020; Weinreb et al., 2020). To record time-resolved cell lineages, introducing a sequence-evolving barcode into cells provides more capacity to record cells and their lineages in various time windows. Insertion and deletion mutations at CRISPR/Cas9 target sequences accumulate progressively over time until the PAM (protospacer adjacent motif) sequence is destroyed. Based on barcode similarity and mutation hierarchy, these methods recapitulate relationships among different tissues and the ancestry of germ layers (extraembryonic and embryonic endoderm), providing a complete reconstruction of cell lineages (Chan et al., 2019; Frieda et al., 2017; Raj et al., 2018). Recently, these techniques have been advanced with terminal transferase and prime-editing methods to avoid PAM-sequence dependency in the recording (Choi et al., 2022; Loveless et al., 2021). Where introducing engineered barcodes into live cells is not feasible, acquired mutations in genomic or mitochondrial DNA (Mimitou et al., 2021; Park et al., 2021; Swanson et al., 2021) provide a distinct barcode identity and allow tracing of cells and clones in the development and disease of humans. 
Innate variable sequences will be another source of cell barcodes to reveal the identity of cell lineages that appear in response to the environment (Zhang et al., 2018).
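The idea of inferring relationships from barcode similarity can be illustrated with a toy comparison of accumulated edits. This Python sketch is not the reconstruction algorithm of any cited study: it assumes that each cell is summarized as a set of edit labels, uses the Jaccard index as the similarity measure, and the cell names and edit labels are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Shared edits divided by all edits observed in either cell."""
    union = a | b
    if not union:
        return 0.0  # two unedited barcodes carry no lineage evidence
    return len(a & b) / len(union)


def closest_relative(name: str, cells: dict):
    """Return (other_cell, similarity) for the most similar barcode.

    cells maps a cell name to its set of edits and must hold at least two cells.
    Ties are resolved by alphabetical order of the cell names.
    """
    others = sorted(other for other in cells if other != name)
    best = max(others, key=lambda other: jaccard(cells[name], cells[other]))
    return best, jaccard(cells[name], cells[best])


# Hypothetical edits written as "target site:mutation".
example_cells = {
    "a": {"s1:del3", "s2:ins1"},
    "b": {"s1:del3", "s2:ins1", "s4:del2"},
    "c": {"s1:del3", "s3:del5"},
    "d": {"s5:ins2"},
}
relative_of_a = closest_relative("a", example_cells)
```

In this example, cells a and b share two edits and differ by one later edit, which places them closer to each other than to c, while the edit shared by a, b, and c marks an earlier common ancestor; full lineage reconstruction additionally uses the hierarchy of such nested mutations.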
Most cell states that respond to and report on the environment are recorded in various cellular molecules, such as DNA, RNA, and proteins. Thus, reconstructing cellular states responding to biological events, such as development or diseases, requires extracting all the materials containing that information. To this end, single-cell barcoding technologies are rapidly updated and improved to capture multimodal data from a cell. Using the molecular identity of cellular materials, separating DNA and RNA allows the construction of both libraries simultaneously. Without physical separation, molecular tagging with single-cell barcodes expands the number of target materials captured at a time. Furthermore, the cell hashing strategy incorporates hashtag barcodes into cell groups, thereby extending throughput in complex samples. However, current methods still miss materials that contain necessary information. Naked DNA, RNA, and proteins do not capture time-resolved cell states. Examples of the missing information include phosphorylation or methylation of proteins; noncoding RNAs with no poly-A tails; metabolite or ion exchanges in neuronal responses; and cell-to-cell communication by secreted molecules. Abnormalities in these events frequently characterize cells in diseases, such as cancer. Although recent advances in spatial transcriptomics adopt methods developed with imaging technology, they have limitations of their own that remain to be solved (Eng et al., 2019; Rodriques et al., 2019; Williams et al., 2022). Therefore, breakthroughs in single-cell barcoding technology will be required to analyze the actual cellular states representing biological systems and raise new ideas to solve fundamental questions in biology.
This work was supported by the National Research Foundation (NRF) of Korea (2021R1F1A104962311) and the Gachon University Research Fund (GCU-202102820001).
I.S.K. wrote the manuscript.
The author has no potential conflicts of interest to disclose.
Table 1. Summary of recently published single-cell multiomics studies.
Method | Genome | Transcript | Protein | Isolation method | Throughput | Capturing | Cell | Reference | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Genome | Epigenome1 | Epigenome2 | Whole cDNA | Partial cDNA | Surface | Intra-cellular | ||||||||
(Epi)genome profiling: DNA/RNA | ||||||||||||||
sc(ATAC+RNA) | ChromAcc (Tn5) | Full-length | FACS | Low (~96) | Poly-dT separation (biotin) | Intact | (Reyes et al., 2019) | |||||||
ASTAR-seq | ChromAcc (Tn5) | Full-length | Droplet, C1 | Mid (96×) | Poly-dT separation (biotin) | Intact | (Xing et al., 2020) | |||||||
SHARE-seq | ChromAcc (Tn5) | 3'-end | Split&Pool | High | Poly-dT separation (biotin) | Intact, Nuclei | (Ma et al., 2020) | |||||||
Paired-seq | ChromAcc (Tn5) | 3'-end | Split&Pool | High | Adapter ligation | Nuclei | (Zhu et al., 2019) | |||||||
sci-CAR | ChromAcc (Tn5) | 3'-end | Split&Pool | High | Material tagging, splitting | Nuclei | (Cao et al., 2018) | |||||||
SNARE-seq | ChromAcc (Tn5) | 3'-end | Droplet, 10× | High | Material tagging, splitting | Nuclei | (Chen et al., 2019) | |||||||
ISSAAC-seq | ChromAcc (Tn5) | 3'-end | Droplet, 10× | High | Material tagging, splitting | Nuclei | (Xu et al., 2022) | |||||||
scCAT-seq | ChromAcc (Tn5) | Full-length | FACS | Low (~96) | Nuclei separation (spin) | Intact | (Liu et al., 2019) | |||||||
scSIDR-seq | CNV | Full-length | FACS | Low (~48) | Nuclei separation (ab/magnet) | Intact | (Han et al., 2018) | |||||||
scChaRM-seq | DNAme (WGBS) | ChromAcc (GpC) | 3'-end | Hand picking | Low (~96) | Poly-dT separation (biotin) | Intact | (Yan et al., 2021) | ||||||
scNMT-seq | DNAme (WGBS) | ChromAcc (GpC) | Full-length | FACS | Low (~96) | Poly-dT separation (biotin) | Intact | (Clark et al., 2018) | ||||||
scTrio-seq2 | CNV | DNAme (RRBS) | 3'-end | Hand picking | Low (~96) | Nuclei separation (magnet) | Intact | (Bian et al., 2018) | ||||||
scNOMeRe-seq | DNAme (WGBS) | ChromAcc (GpC) | Full-length | Hand picking | Low (~96) | Nuclei separation (magnet) | Intact | (Wang et al., 2021) | ||||||
snmCAT-seq | DNAme (WGBS) | ChromAcc (GpC) | Full-length | FACS | Low (~384) | Material tagging (C, mC) | Nuclei | (Luo et al., 2022) | ||||||
Targeted amplification | ||||||||||||||
scDAM&T-seq | TF/DNA binding | 3'-end | FACS | Low (~384) | Material tagging | Intact | (Rooijers et al., 2019) | |||||||
TARGET-seq | Mutation | Full-length | FACS | Low (~384) | Material tagging, splitting | Intact | (Rodriguez-Meira et al., 2019) | |||||||
sc(Insert + RNA) | Targeted | 3'-end | FACS | Low (~384) | Material tagging, splitting | Intact | (Kim et al., 2020) | |||||||
Perturb-ATAC | ChromAcc (Tn5) | gRNA | Droplet, C1 | Mid (96×) | Material tagging, splitting | Intact | (Rubin et al., 2019) | |||||||
Multimodal information in a molecule | ||||||||||||||
scMethyl-HiC | DNAme (WGBS) | Hi-C | FACS | Low (~96) | DNA (multimodality) | Nuclei | (Li et al., 2019) | |||||||
RETrace | Micro-satellite | DNAme (RRBS) | FACS | Low (~96) | DNA (multimodality) | Intact | (Wei and Zhang, 2020) | |||||||
sn-m3C-seq | DNAme (WGBS) | Hi-C (m3C) | FACS | Low (~384) | DNA (multimodality) | Nuclei | (Lee et al., 2019) | |||||||
iscCOOL-seq | DNAme (WGBS) | ChromAcc (GpC) | Hand picking | Low (~96) | DNA (multimodality) | Intact | (Gu et al., 2019) | |||||||
CUT&Tag2for1 | Chrom_Act | Chrom_Rep | Sorter | Mid (~5,184) | DNA (multimodality) | Nuclei | (Janssens et al., 2022) | |||||||
scRCAT-seq | 5'-end, 3'-end | Hand picking | Low (~96) | RNA (multimodality) | Intact | (Hu et al., 2020) | ||||||||
SINC-seq | cytRNA, nucRNA | Microfluidic device | High | RNA (multimodality) | Intact | (Abdelmoez et al., 2018) | ||||||||
Protein profiling with RNA | ||||||||||||||
REAP-seq | 3'-end | Surface | Droplet, 10× | High | Material tagging | Intact | (Peterson et al., 2017) | |||||||
CITE-seq | 3'-end | Surface | Droplet, 10× | High | Material tagging | Intact | (Stoeckius et al., 2017) | |||||||
inCITE-seq | 3'-end | Intra-nuclear | Droplet, 10× | High | Material tagging | Nuclei | (Chung et al., 2021) | |||||||
RAID | 3'-end | Surface | Intra | FACS | Low (~384) | Material tagging | Intact | (Gerlach et al., 2019) | ||||||
ECCITE-seq | gRNA + TCR | 5'-end | Surface + cell hashing | Droplet, 10× | High | Material tagging | Intact | (Mimitou et al., 2019) | ||||||
scSTAP | Full-length | Surface | Intra | Hand picking | Low (~96) | Material tagging, splitting | Intact | (Jiang et al., 2022) | ||||||
Protein profiling with chromatin | ||||||||||||||
PHAGE-ATAC | Mitochondria | ChromAcc (Tn5) | Surface | Intra | Droplet, 10× | High | Material tagging | Intact | (Fiskin et al., 2022) | |||||
ASAP-seq | Mitochondria | ChromAcc (Tn5) | Surface + cell hashing | Droplet, 10× | High | Material tagging | Intact | (Mimitou et al., 2021) | ||||||
ICICLE-seq | ChromAcc (Tn5) | Surface | Droplet, 10× | High | Material tagging | Intact | (Swanson et al., 2021) | |||||||
DNA/RNA/protein | ||||||||||||||
DOGMA-seq | Mitochondria | ChromAcc (Tn5) | 3'-end | Surface + cell hashing | Droplet, 10× | High | Material tagging | Intact | (Mimitou et al., 2021) | |||||
TEA-seq | ChromAcc (Tn5) | 3'-end | Surface | Droplet, 10× | High | Material tagging | Intact | (Swanson et al., 2021) | ||||||
NEAT-seq | ChromAcc (Tn5) | 3'-end | Surface + cell hashing | Intra | Droplet, 10× | High | Material tagging | Intact | (Chen et al., 2022) |