Mol. Cells 2023; 46(2): 86-98
Published online February 27, 2023
https://doi.org/10.14348/molcells.2023.0013
© The Korean Society for Molecular and Cellular Biology
Correspondence to : eaststar0@gmail.com
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
The genome is almost identical in all the cells of the body. However, the functions and morphologies of each cell are different, and the factors that determine them are the genes and proteins expressed in the cells. Over the past decades, studies on epigenetic information, such as DNA methylation, histone modifications, chromatin accessibility, and chromatin conformation have shown that these properties play a fundamental role in gene regulation. Furthermore, various diseases such as cancer have been found to be associated with epigenetic mechanisms. In this study, we summarized the biological properties of epigenetics and single-cell epigenomic profiling techniques, and discussed future challenges in the field of epigenetics.
Keywords 3D chromatin structure, DNA methylation, histone modification, RNA modification, single-cell epigenomics
In 1942, Conrad Waddington defined “epigenetics” as a change in phenotype without a change in genotype (Waddington, 1942). According to the current understanding from several studies, epigenetics describes how the total DNA content of somatic cells is the same and the existing DNA sequence does not change, whereas heritable gene expression patterns vary markedly among cell types depending on changes in the chromatin state (Felsenfeld, 2014). Epigenetic mechanisms, in addition to DNA templates, can affect gene regulation during and after transcription and translation (Halušková, 2010). These properties can be profiled by various sequencing techniques. In addition, recent single-cell based profiling methods provide an opportunity to identify cell-to-cell variability. In this review, we provide a general overview of representative epigenetic phenomena, such as DNA methylation and histone modification, and epigenome profiling methods.
In eukaryotic cells, DNA is packaged into chromatin in the nucleus (Jenuwein and Allis, 2001). Eukaryotic chromatin is a highly condensed structure that is essential for basic nuclear processes such as transcription and replication. Chromatin is divided into weakly and strongly condensed regions. The weakly condensed region, called euchromatin, is generally transcriptionally active, whereas the strongly condensed region, called heterochromatin, is where gene expression is restricted (Back, 1976). The nucleosome is the basic unit of chromatin and consists of approximately 147 base pairs of DNA and a histone octamer containing two copies each of H2A, H2B, H3, and H4 (Jenuwein and Allis, 2001). The histone tails can be modified post-translationally by acetylation, methylation, and phosphorylation (Strahl and Allis, 2000) (Fig. 1). These post-translational covalent modifications are accumulated by epigenetic mechanisms and can alter the chromatin state and subsequent gene expression (Kouzarides, 2007) (Fig. 2). Histone modifications are mainly profiled using chromatin immunoprecipitation followed by sequencing (ChIP-seq) (Barski et al., 2007; Mikkelsen et al., 2007). In recent studies, histone modification information has been analyzed from small numbers of cells using Tn5 transposase-mediated tagmentation techniques such as cleavage under targets and tagmentation (CUT&TAG) (Kaya-Okur et al., 2019) (Table 1).
Histone acetylation is the process by which the acetyl group of acetyl-CoA is transferred to the NH3+ group of the histone lysine. The addition of an acetyl group neutralizes the positive state of lysine and reduces the binding of histone proteins to DNA, making the DNA open and accessible to transcription factors (Bannister and Kouzarides, 2011). Histone hyperacetylation is a hallmark of transcriptional activity (Clayton et al., 1993; Pogo et al., 1966). The balance of histone acetyltransferases (HATs) and histone deacetylases (HDACs) is an important factor in the regulation of histone acetylation. HATs can be divided into two classes based on their subcellular localization and function (Parthun, 2007). A-type HATs are nuclear enzymes involved in the regulation of gene expression through the acetylation of nucleosomal histones in the chromatin context. B-type HATs are located in the cytoplasm and are responsible for acetylating newly synthesized histones prior to their transport from the cytoplasm to the nucleus where they assemble into nucleosomes (Roth et al., 2001) (Table 2).
HDACs oppose the effects of HATs and reverse lysine acetylation. In humans, the 18 HDACs fall into four distinct classes: class I (Rpd3-like; HDAC1-3 and HDAC8), class II (Hda1-like; HDAC4-7, HDAC9, and HDAC10), class III (Sir2-like NAD-dependent enzymes; SIRT1-7), and class IV, with the single member HDAC11 (Yang and Seto, 2008). In general, they are involved in multiple signaling pathways, and have relatively low specificity for particular acetyl groups, allowing a single enzyme to deacetylate multiple sites within histones. Various HDAC inhibitors have been developed and used to treat tumors. They can also induce cancer cell cycle arrest, differentiation, and apoptosis (Bose et al., 2014; Suraweera et al., 2018) (Table 2).
Phosphorylation is one of the most common post-translational modifications and occurs on serine, threonine, and tyrosine residues of histone proteins (North et al., 2014). Similar to histone acetylation, this modification is highly dynamic and its levels are regulated by the addition and removal of phosphate groups (Oki et al., 2007). Phosphoryl groups are transferred from ATP to the hydroxyl groups of amino acids by kinases, thereby adding a negative charge to histone proteins. In mammalian cells, H3S10 phosphorylation, which is mediated by Aurora-B kinase, is essential for mitosis and meiosis (Wei et al., 1998). This is because H3S10 phosphorylation dissociates HP1, a protein that is recruited by H3K9me3 in interphase and contributes to heterochromatin formation, from chromatin and thereby prevents the formation of condensed heterochromatin. In addition, phosphorylation of serine 139 of H2AX is a histone modification induced by ATM and ATR. These modifications are involved in various DNA damage response pathways, including non-homologous end joining and homologous recombination (Cheung et al., 2005; Downs et al., 2004).
Histone methylation usually occurs on the side chains of lysine and arginine and is one of the most important post-translational modifications. Unlike acetylation and phosphorylation, histone methylation does not affect the charge on histone proteins. Histone lysine residues can be mono-, di-, or tri-methylated, while arginine residues can be mono- or di-methylated (Lan and Shi, 2009; Ng et al., 2009). Most histone lysine methyltransferases (HKMTs) have an evolutionarily well-conserved sequence motif identified in
H3K9 methylation is associated with heterochromatin. The methylation of H3K9 is driven by SUV39H1/2, G9a, G9a-like protein (GLP), and SETDB1 (Fritsch et al., 2010). SETDB1 co-translationally catalyzes mono- and dimethylation when H3K9 binds to ribosomes. G9a and GLP can form homomeric and heteromeric complexes and are involved in the formation of H3K9me1 and H3K9me2 (Tachibana et al., 2002). SUV39H1/2 is a key enzyme for H3K9me3 in the pericentromeric heterochromatin (Rea et al., 2000). H3K4 methylation is enriched in enhancer regions, promoter regions, and transcription start sites and is usually associated with the transcriptional activation of nearby genes. H3K4me1 and H3K4me3 are distributed in the enhancer and promoter regions, respectively, and H3K4me2 is distributed in the replication origin region and the 5’ end of genes (Heintzman et al., 2007; Kim and Buratowski, 2009; Santos-Rosa et al., 2002). These distributions are related to their functions. Set1, an H3K4 methyltransferase identified in yeast, operates in COMPASS (complex of proteins associated with Set1) (Briggs et al., 2001; Miller et al., 2001). This enzyme is well conserved from yeast to humans and is responsible for H3K4 methylation. H3K27 methylation is induced by PRC2, a complex of polycomb group proteins. PRC2 has four core subunits (EZH2, SUZ12, EED, and RbAp46/48), and EZH2 catalyzes methylation (Cao et al., 2002; Kuzmichev et al., 2002). H3K27me2 and H3K27me3 are the hallmarks of gene repression (Banaszynski et al., 2013; Barski et al., 2007) (Fig. 2). Conversely, H3K27me1 is associated with transcriptional promotion and is distributed in the promoter regions of the active genes (Barski et al., 2007). H3K36 and H3K79 methylation are considered to be active markers distributed in actively transcribed chromatin regions. H3K36 can be mono- and dimethylated by NSD1-3 and ASH1L, and trimethylation is catalyzed by SETD2 (Eram et al., 2014). 
H3K36 methylation can inhibit the activity of PRC2, preventing H3K27 methylation catalyzed by PRC2 (Yuan et al., 2011). In mammals, H3K79 is methylated by DOT1L. Unlike other HKMTs, DOT1L does not contain a SET domain, which may be because H3K79 is located in the globular histone core, where it is difficult to access (Jones et al., 2008) (Table 2).
Aberrant histone modifications can affect chromosomal segregation or cause abnormal regulation of oncogenes and/or tumor suppressor genes (Table 2). A histone modification change frequently observed in cancer cells is the HDAC-mediated loss of acetylation (Li and Seto, 2016). SIRT1 and deacetylase activities are upregulated in various tumor types. In addition to the loss of acetylation, gene silencing caused by increased H3K27me3 following EZH2 overexpression has been implicated in the progression of several solid malignancies, such as breast cancer (Kleer et al., 2003). Overexpression of lncRNA HOTAIR in ovarian cancer recruits EZH2 and alters the H3K27me3 landscape (Dai et al., 2021). It has also been reported that the histone demethylase lysine-specific demethylase 4A promotes the progression of nasopharyngeal carcinoma by promoting hypoxia-inducible factor-1α expression (Zhao et al., 2021). Aberrant histone modifications are closely related to cancer, and histone-modifying enzyme inhibitors such as HDAC inhibitors are being actively studied and developed.
DNA methylation was first reported in 1948, and its role in gene regulation was first suggested in the mid-1970s (Holliday and Pugh, 1975; Hotchkiss, 1948). In mammalian DNA, there are many 5-methylcytosine (5mC) in sequences where cytosine and guanine are connected in the 5' to 3' direction (Coulondre et al., 1978). Most DNA methylations are stable and play important roles during development and the cell cycle in several epigenetic processes, such as genomic imprinting and X-chromosome inactivation (Smith and Meissner, 2013). DNA methylation plays an important role in controlling the epigenetic environment during reprogramming of somatic cells into pluripotent stem cells (Lee et al., 2014), which is mediated by DNA methyltransferase enzymes (DNMTs) (Table 2). 5mC is mainly profiled using whole-genome bisulfite sequencing (BS-seq) (Cokus et al., 2008). In addition, techniques such as reduced representation bisulfite sequencing (RRBS) and methylated DNA immunoprecipitation sequencing (MeDIP-seq) have been developed to profile the DNA methylation information (Meissner et al., 2005; Weber et al., 2005) (Table 1).
In the mammalian genome, CpG islands are DNA sequences approximately 1000 base pairs in length with a higher CpG density than the rest of the genome (Bird et al., 1985). Most gene promoters (more than 2/3), particularly promoters of housekeeping genes, are embedded in CpG islands (Saxonov et al., 2006). In general, CpG sites in promoter regions within CpG islands are not methylated, unlike the other CpG sites (Bird et al., 1985). Silencing of gene expression by methylation of promoter DNA is stable and long-lasting, unlike transcriptional repression by histone modifications (Mohn et al., 2008). Because of these properties, methylation of CpG islands is an important epigenetic mechanism that regulates imprinted genes and gene expression during development and differentiation. Unlike promoter methylation, gene body DNA methylation is positively correlated with gene expression and is considered a characteristic of transcribed genes in cells (Ball et al., 2009). CpG islands within genes or gene bodies can reveal specific methylation differences between tissues or cancer samples. However, CpG islands in promoter regions show little difference in methylation; instead, specific differences appear at some distance from the CpG islands, in regions called CpG island shores (Irizarry et al., 2009).
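The notion of "higher CpG density" can be made concrete with a small calculation. The sketch below is illustrative only: it computes the GC content and the observed/expected CpG ratio of a sequence, and applies commonly used heuristic thresholds (length of at least 200 bp, GC content above 50%, observed/expected ratio above 0.6). These thresholds and the function names are assumptions for illustration and are not taken from the studies cited above.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC content, and observed/expected CpG ratio of a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on this strand
    gc_content = (c + g) / n
    # Expected CpG count if C and G occurred independently of each other.
    expected = (c * g) / n
    obs_exp = cpg / expected if expected > 0 else 0.0
    return {"length": n, "gc_content": gc_content, "cpg_obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply heuristic thresholds (assumed defaults) to one candidate window."""
    s = cpg_stats(seq)
    return (s["length"] >= min_len
            and s["gc_content"] > min_gc
            and s["cpg_obs_exp"] > min_obs_exp)


if __name__ == "__main__":
    print(cpg_stats("CG" * 100))
    print(looks_like_cpg_island("AT" * 100))
```

In practice such a test would be applied to sliding windows along a chromosome; the single-window form is kept here for brevity.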
DNA methylation consists of three stages: de novo DNA methylation, maintenance, and demethylation (Fig. 3). DNMT3A and DNMT3B, known as de novo methyltransferases, consist of three main domains: the Pro-Trp-Trp-Pro (PWWP), ATRX-DNMT3-DNMT3L (ADD), and MTase domains (Okano et al., 1998; 1999). Until it binds to the unmodified K4 residue of the H3 tail, the ADD domain binds to the MTase domain and acts as an inhibitor (Guo et al., 2015). When the ADD domain binds to unmodified H3K4, it is released from the MTase domain, which enables DNA methylation (Otani et al., 2009). As the ADD domain is repelled by the increasing number of methyl residues in H3K4, DNA methylation does not normally occur in CpG-rich promoters of H3K4me3 enriched genes (Piunti and Shilatifard, 2016). As transcription proceeds, the PWWP domain binds to H3K36me3 generated by the histone methyltransferase SETD2 to induce DNA methylation in the gene body (Dhayalan et al., 2010). Among de novo methylations, only symmetrical CpG methylation is maintained during DNA replication. This depends on the activity of DNMT1 and UHRF1, an E3 ubiquitin-protein ligase with five conserved domains (Arita et al., 2012). The TTD and PHD domains of UHRF1 work together to recognize and bind to H3K9me3, thereby loading DNMT1 onto the newly synthesized DNA substrate (Karg et al., 2017). Therefore, DNMT1 is located at the replication fork, where newly synthesized hemimethylated DNA is formed during DNA replication (Leonhardt et al., 1992). DNMT1 is called the maintenance DNMT because it maintains the pattern of DNA methylation. DNA methylation can be erased by passive or active demethylation. Passive demethylation leads to replication-dependent dilution of 5mC due to defects in the maintenance methylation mechanism that copies methylation patterns during DNA replication (Howell et al., 2001). 
Passive demethylation is biologically important for erasing DNA methylation in preimplantation embryos and primordial germ cells. Active DNA demethylation mechanisms have been found to be mediated by the Tet1, Tet2, and Tet3 enzymes. The TET proteins oxidize 5mC to produce 5-hydroxymethylcytosine (5hmC), which initiates demethylation (Ito et al., 2010; Tahiliani et al., 2009). 5hmC can be further oxidized to generate 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC), which can be removed by DNA glycosylases (TDG and SMUG1) and replaced with an unmethylated cytosine through the base excision repair pathway (Maiti and Drohat, 2011; Weber et al., 2016) (Table 2).
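The replication-dependent dilution of 5mC described above can be illustrated with a simple strand-level model. This is a minimal sketch under stated assumptions, not a model from the cited studies: at each replication the parental strand keeps its methyl groups, and the newly synthesized strand is methylated at a hemimethylated site with probability `maintenance_efficiency` (a hypothetical parameter standing in for DNMT1/UHRF1 activity). Active demethylation and de novo methylation are ignored.

```python
from typing import List


def passive_dilution(initial_fraction: float, divisions: int,
                     maintenance_efficiency: float = 0.0) -> List[float]:
    """Return the methylated fraction of CpG sites (per strand) after each division.

    With maintenance_efficiency = 0 the fraction halves at every replication;
    with maintenance_efficiency = 1 the pattern is copied faithfully.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    fractions = [initial_fraction]
    # Half of the strands are parental (methylation retained); the other half
    # are new and regain methylation only through maintenance.
    factor = (1.0 + maintenance_efficiency) / 2.0
    for _ in range(divisions):
        fractions.append(fractions[-1] * factor)
    return fractions


if __name__ == "__main__":
    print(passive_dilution(0.8, 3))        # no maintenance
    print(passive_dilution(0.8, 3, 1.0))   # perfect maintenance
```

Under this model, a complete loss of maintenance reduces methylation to one eighth of its starting level within three divisions, which gives a sense of how quickly passive demethylation can act.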
DNA methylation and histone modification work together to regulate transcription. Because DNMTs generally suppress gene regions through methylation, they inhibit gene expression by interacting with enzymes that regulate histone modification. DNMT1 and DNMT3a restrict gene expression by binding to SUV39H1, an enzyme that methylates H3K9 (Fuks et al., 2000). In addition, DNMT1 and DNMT3b bind to HDACs and regulate gene expression (Fuks et al., 2000; Geiman et al., 2004). Methyl-binding proteins, such as MeCP2 and UHRF, enhance gene repression by interacting with methylated DNA and histones (Nan et al., 1998). Although the interaction between RNA and DNA modifications is not clearly defined, recent studies have reported that high N6-methyl adenosine (m6A) modifications in esophageal squamous cell carcinoma cells lead to DNA demethylation, altering chromatin accessibility and affecting gene transcription (Deng et al., 2022).
DNA methylation is involved in various diseases such as brain disorders and cancer (Table 2). For example, an autosomal-dominant mutation in the N-terminal regulatory domain of DNMT1 was identified in patients with hereditary sensory and autonomic neuropathy type 1, who presented with dementia, hearing loss, and narcolepsy in adulthood (Klein et al., 2011). DNMT3A is associated with the development of acute myeloid leukemia (AML), and TET2 mutations are considered a common epigenetic marker in several hematological malignancies including AML, chronic myelomonocytic leukemia, and lymphomas (Langemeijer et al., 2009; Spencer et al., 2017). Overexpression of UHRF1 promotes hypermethylation of the promoter of thioredoxin-interacting protein (TXNIP), a tumor suppressor gene, and downregulates TXNIP expression in cervical cancer, contributing to carcinogenesis (Kim et al., 2021). The DNMT3A mutation is a missense mutation of arginine 882 (R882) in the MTase domain, which interferes with DNMT3A formation and reduces the methylation activity of the enzyme (Russler-Germain et al., 2014). Conversely, TET2 mutations induce hypermethylation of enhancer regions in myeloid malignancies (Ko et al., 2010). It is still unclear how two mutants with opposite functions can achieve similar phenotypes. A recent study on patients with AML demonstrated that mutations in DNMT3A and TET2 can cause irregular DNA methylation patterns and transcriptional expression levels in genes known to be involved in AML pathogenesis (Ponciano-Gómez et al., 2017). In addition, Lee et al. (2023) reported that the regulation of C-Maf-inducing protein methylation mediated by DNMT1 and TET2 was correlated with nonalcoholic fatty liver disease.
In mammalian cells, DNA forms nucleosomes, which are arranged into higher-order chromatin structures that play important roles in regulating the cell cycle, replication, development and gene function. The three-dimensional chromatin structure includes cis-regulatory interactions such as enhancer-promoter interactions and repressive interactions such as lamina-associated domains (LADs) and is mediated by structural elements such as CCCTC-binding factor (CTCF) and cohesin (Huang et al., 2021; Rowley and Corces, 2018; Schoenfelder and Fraser, 2019).
Enhancers, a representative cis-regulatory element, are usually several kilobases distant from gene promoters but can be involved in transcriptional regulation by coming into close spatial proximity to promoters through chromatin looping and folding. Chromatin conformation capture (3C) profiles the interactions between these specific genomic regions (Dekker et al., 2002). A more recent technique, Hi-C, allows global quantification of all interactions present in the nucleus. Based on the Hi-C data, it was possible to analyze compartments, which are sets of chromosomal regions with similar, long-range Hi-C contact patterns, and self-interacting genomic regions called topologically associated domains (TADs) (Dixon et al., 2012; Lieberman-Aiden et al., 2009). Compartment A is associated with open chromatin, and Compartment B is associated with closed chromatin and is specific to the cell type (Dixon et al., 2012). Previous studies have reported that TADs are well conserved among various cell types and species. However, multiple methods based on Hi-C have been developed at the single-cell level, and recent studies have shown that chromatin contacts, such as TAD structures, vary considerably at the single-cell level (Stevens et al., 2017). In general, the frequency of interactions within TADs is high and the frequency of interactions between TADs is low. CTCF is an important factor in regulating TAD structure and is involved in the insulation between TADs (Szabo et al., 2020). The disruption of TAD boundaries can affect gene expression and is associated with various diseases and cancers. A recent study reported that the combination of genome profiling and CRISPR-Cas9 genome engineering could identify regions with repetitive changes in the 3D genome structure and predict oncogene activity (Xu et al., 2022). The nuclear lamina and its constituent filament proteins, the A- and B-type lamins, are located in the nuclear envelope. 
LADs are heterochromatic regions that constitute approximately 40% of the genome and are in contact with the nuclear lamina. The LAD region is enriched in H3K9me2/3 and H3K27me3, and few genes within the LAD are expressed (Guelen et al., 2008; Lund et al., 2014).
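The defining property of TADs described above, frequent contacts within a domain and fewer contacts between domains, can be checked on a binned contact matrix. The following is a toy sketch, not a TAD-calling method: the matrix values, the bin coordinates, and the function name `mean_contacts` are hypothetical, and TAD boundaries are assumed to be known in advance.

```python
from typing import List, Sequence, Tuple


def mean_contacts(matrix: Sequence[Sequence[float]],
                  tads: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (mean intra-TAD contact, mean inter-TAD contact).

    matrix is a symmetric bin-by-bin contact matrix; tads lists half-open
    bin intervals (start, end). Bins outside every TAD are ignored.
    """
    domain_of = {}
    for d, (start, end) in enumerate(tads):
        for b in range(start, end):
            domain_of[b] = d
    intra, inter = [], []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, diagonal excluded
            di, dj = domain_of.get(i), domain_of.get(j)
            if di is None or dj is None:
                continue
            (intra if di == dj else inter).append(matrix[i][j])

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(intra), mean(inter)


if __name__ == "__main__":
    toy = [
        [0, 10, 1, 1],
        [10, 0, 2, 1],
        [1, 2, 0, 8],
        [1, 1, 8, 0],
    ]
    print(mean_contacts(toy, [(0, 2), (2, 4)]))
```

Real Hi-C matrices would first need normalization and correction for the decay of contact frequency with genomic distance, which this sketch deliberately omits.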
Mammalian RNA undergoes numerous post-transcriptional chemical modifications. Although these RNA modifications were discovered in the 1970s, their importance was not highlighted until recently. Over the past decade, several studies have shown that RNA modifications play an important role in regulating gene expression, similar to the epigenetic modifications of DNA and histones (Helm and Motorin, 2017).
m6A is the most abundant endogenous RNA modification in eukaryotes. m6A is precisely controlled by three protein groups: “writers” that install m6A residues along target RNA transcripts, “erasers” that remove modifications at specific sites, and “readers” that recognize modified regions. m6A RNA methylation is catalyzed by the “writer” proteins methyltransferase-like 3 and 14 (METTL3 and METTL14). METTL3 and METTL14 form stable heterodimer complexes (Bokar et al., 1997; Liu et al., 2014). In addition, the METTL3-METTL14 complex interacts with Wilms tumor 1-associated protein to target a wide range of RNA substrates and install m6A (Ping et al., 2014). m6A demethylation is catalyzed by the AlkB family of non-heme Fe(II)/α-KG-dependent dioxygenases. Representative human AlkB family members include fat mass- and obesity-associated protein (FTO) and alkylation repair homolog protein 5 (ALKBH5). FTO, the first RNA demethylase discovered in mammalian cells, removes m6A through continuous oxidation that generates N6-hydroxymethyladenosine and N6-formyladenosine, which can be hydrolyzed into adenosine over several hours (Fu et al., 2013). ALKBH5 catalyzes the m6A-to-A conversion by directly removing the methyl group (Jang et al., 2022; Zheng et al., 2013). The YT521-B homology (YTH) domain-containing protein family recognizes and binds to m6A. YTHDC1 is located in the nucleus and is associated with various RNA splicing regulators, such as serine-arginine repeat proteins. It recognizes m6A on the lncRNA XIST and mediates X chromosome silencing (Patil et al., 2016; Zhang et al., 2010). YTHDF1 and YTHDF3 regulate translation efficiency by interacting with the ribosome of the target RNA, and YTHDF2 promotes the degradation of m6A-modified RNA by recruiting deadenylation enzyme complexes, thereby affecting RNA stability (Du et al., 2016; Li et al., 2017; Wang et al., 2015).
m6A modification is associated with various diseases and cancers. METTL3 promotes the maturation of mir-1246 by methylating pri-mir-1246 and down-regulating the anti-oncogene SPRED2, thereby improving the tumor metastasis ability of colorectal cancer (Peng et al., 2019). In addition, METTL3 is also associated with prostatitis and Aicardi syndrome, FTO is an oncogene in AML, breast cancer and colorectal cancer, and YTHDF1 and YTHDF2 are associated with pancreatic cancer and breast cancer, respectively (Huang et al., 2019; Liu and Pan, 2015; Pan et al., 2021).
Several techniques have been developed to profile epigenetic regulation (Table 1). However, although these bulk-cell profiling methods showed an average signal for a cell population, it was difficult to represent cell-to-cell variation within tissue, such as gene expression heterogeneity. Single-cell epigenomic analysis has the potential to overcome these limitations and to elucidate gene regulatory mechanisms across diverse cellular environments.
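The limitation of bulk profiling noted above can be shown with a toy calculation. In this sketch, two hypothetical cell populations have the same average signal, so a bulk measurement could not tell them apart, yet their cell-to-cell variance differs sharply. All values and the function name `summarize` are invented for illustration.

```python
from statistics import mean, pvariance


def summarize(cells):
    """Return the population average (what bulk profiling reports) and the variance across cells."""
    return {
        "bulk_like_mean": mean(cells),
        "cell_to_cell_variance": pvariance(cells),
    }


if __name__ == "__main__":
    homogeneous = [5, 5, 5, 5]       # every cell carries the same signal
    heterogeneous = [0, 10, 0, 10]   # two subpopulations, same average
    print(summarize(homogeneous))
    print(summarize(heterogeneous))
```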
Single-cell ChIP-seq (scChIP-seq) was the first reported technique for profiling histone modifications at the single-cell level. scChIP-seq separates single cells into droplets containing lysis buffer and micrococcal nuclease (MNase), followed by barcoding before immunoprecipitation (Rotem et al., 2015). Since then, several technologies have been developed, such as single-cell chromatin immunocleavage followed by sequencing, chromatin integration labeling sequencing, single-cell cleavage under targets and release using nucleases (scCUT&RUN), and CUT&TAG (Bartosovic et al., 2021; Harada et al., 2019; Kaya-Okur et al., 2019; Ku et al., 2019). Several single-cell studies have confirmed that cell-to-cell variations in histone modifications are correlated with heterogeneity in gene expression. For example, measurements of H3K4me2 levels within mouse embryonic stem cell populations revealed significant variation, which was observed in gene enhancers and transcriptionally repressed genes (Rotem et al., 2015). In addition, integrating single-cell-level H3K4me3 profiling and scRNA-seq in the mouse brain confirmed cellular heterogeneity in oligodendrocyte populations that had appeared homogeneous, and revealed that they could be separated into subpopulations enriched with module-specific genes (Bartosovic et al., 2021).
Single-cell level profiling of DNA methylation has mainly been achieved through bisulfite conversion methods such as single-cell BS-seq (scBS-seq), single-cell RRBS, and post-bisulfite adapter-tagging (Guo et al., 2013; Miura et al., 2012; Smallwood et al., 2014). Bisulfite sequencing is based on the conversion of unmethylated cytosine into uracil. scRRBS generates DNA fragments containing CpG-rich ends by using one or more restriction enzymes, followed by bisulfite sequencing (Guo et al., 2013). scBS-seq was used to perform bisulfite treatment and random priming and extension after cell isolation and lysis (Smallwood et al., 2014). These methods show heterogeneity in DNA methylation in both mouse and human cells. Multi-omics techniques have also been developed to investigate the interaction between DNA methylation heterogeneity, gene expression heterogeneity and other epigenetic data (Table 3). For example, single-cell methylome and transcriptome sequencing (scM&T-seq) have demonstrated a functional link between transcriptional and DNA methylation heterogeneity of gene promoters within specific cell and tissue types in mouse muscle stem cells (Angermueller et al., 2016).
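The principle behind bisulfite sequencing can be sketched in a few lines. The example below is a simplified, single-strand simulation under stated assumptions: unmethylated cytosines are converted (uracil, which is read as thymine after amplification), methylated cytosines are protected, and the read is assumed to be already aligned to the reference with the same length and no sequencing errors. The function names and the toy sequence are hypothetical.

```python
from typing import Dict, Set


def bisulfite_convert(seq: str, methylated_positions: Set[int]) -> str:
    """Simulate bisulfite treatment plus amplification on one strand.

    Unmethylated C is read as T; methylated C (by 0-based position) stays C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_methylation(reference: str, read: str) -> Dict[int, bool]:
    """Call each reference cytosine as methylated (True) or unmethylated (False)."""
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True      # protected from conversion, so methylated
        elif read_base == "T":
            calls[i] = False     # converted, so unmethylated
        # any other base is left uncalled
    return calls


if __name__ == "__main__":
    reference = "ACGTCGAC"
    read = bisulfite_convert(reference, methylated_positions={1})
    print(read)
    print(call_methylation(reference, read))
```

Averaging such calls over many reads gives a per-site methylation level in bulk data, whereas in single-cell data each site in each cell is typically covered by very few reads, so calls are close to binary.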
Chromatin accessibility plays an important role in the regulation of gene expression by influencing transcription initiation. Chromatin accessibility can be profiled by the assay for transposase-accessible chromatin using sequencing (ATAC-seq) and DNase I hypersensitive site sequencing (DNase-seq). These methods measure chromatin accessibility based on enzymatic sensitivity (Buenrostro et al., 2013; Song and Crawford, 2010). Single-cell ATAC-seq (scATAC-seq) and single-cell DNase-seq, similar to other single-cell epigenetic profiling techniques, can identify heterogeneity in chromatin accessibility between cells. Cusanovich et al. (2015) applied combinatorial indexing to intact nuclei and were able to distinguish between various cell types by profiling thousands of cells. In addition, a recent study performed scATAC-seq and scRNA-seq on DNA and mRNA of the same cell, respectively, and found a positive correlation between gene expression and chromatin accessibility heterogeneity in each cell (Reyes et al., 2019).
Research on 3D genome structure has been greatly improved by bulk cell Hi-C, but cell-to-cell variability of 3D genome features such as TADs or enhancer-promoter contacts could not be accurately reflected. Single-cell Hi-C can be used to perform interaction profiling depending on the proximity ligation region. These techniques based on Hi-C have revealed that chromatin contact varies depending on cell type and developmental status. Using single-cell Hi-C, Nagano et al. (2017) found that mouse embryonic stem cells showed chromosomal condensation in the early G1 phase and extensive reorganization during replication, suggesting that the cell cycle is related to heterogeneity in chromatin contact. In addition, single-nucleus Hi-C removed the previously used biotin-filling step and performed sticky end ligation, revealing that the chromatin structure was uniquely reconstructed during the oocyte-to-zygote transition in mice (Flyamer et al., 2017) (Fig. 4).
Recently, multi-modal omics techniques that can simultaneously profile two or more epigenetic features in a single cell have facilitated the analysis of the correlations between different traits. For example, sn-m3C-seq can simultaneously profile DNA methylation and chromatin conformation at a single-cell level (Lee et al., 2019). In addition, Paired-seq and SNARE-seq can simultaneously profile gene expression and chromatin accessibility information, and scM&T-seq and snmC2T-seq can co-profile transcriptome and methylome information (Angermueller et al., 2016; Chen et al., 2019; Luo et al., 2022; Zhu et al., 2019). Many multi-modal omics techniques are still being developed, which will further advance our biological understanding (Table 3).
In this review, we summarized several representative epigenetic characteristics, such as DNA methylation and histone modification. Epigenetics research mostly focuses on molecular biology approaches. Although they are not genetic changes, DNA and histone modifications and three-dimensional chromatin structure have a great influence on the changes that can occur within individual cells. Since the 1980s, scientific interest in the molecular mechanisms of epigenetic control to understand disease development and treatment has continued to grow. Despite numerous studies, several epigenetic mechanisms and patterns remain unknown. However, by integrating advanced techniques and sophisticated algorithms, it is now possible to generate and analyze large volumes of epigenetic data. Single-cell multi-omics-based sequencing techniques that are being actively studied can generate information on two or more epigenetic traits. These epigenetic data can facilitate correlation analysis between different traits. Advances in these technologies and information will provide opportunities to establish novel epigenetic markers and their functions in different types of tissue, development, and disease states.
This work was supported by the 2021 Research Fund of the University of Seoul.
U.K. and D.S.L. wrote the manuscript.
The authors have no potential conflicts of interest to disclose.
Table 1. Methods for profiling epigenetic traits
Data type | Methods | Reference |
---|---|---|
Histone modification | ChIP-seq | (Barski et al., 2007) |
Histone modification | CUT&TAG | (Kaya-Okur et al., 2019) |
Histone modification | CUT&RUN | (Skene and Henikoff, 2017) |
DNA methylation | BS-seq | (Cokus et al., 2008) |
DNA methylation | RRBS | (Meissner et al., 2005) |
DNA methylation | MeDIP-seq | (Weber et al., 2005) |
Chromatin contact | Hi-C | (Lieberman-Aiden et al., 2009) |
Chromatin contact | ChIA-PET | (Wei et al., 2006) |
Chromatin contact | Hi-ChIP | (Mumbach et al., 2016) |
Table 2. Epigenetic changes and associated proteins
Epigenetic modification | Associated protein | Function | Reference |
---|---|---|---|
Maintenance DNA methylation | DNMT1 | Maintain the pattern of DNA methylation | (Klein et al., 2011) |
De novo DNA methylation | DNMT3A, DNMT3B | Methylation of unmodified DNA | (Spencer et al., 2017) |
DNA demethylation | TET1, TET2, TET3 | Remove methylation | (Ponciano-Gómez et al., 2017) |
Histone acetylation | p300/CBP | Transcription activation | (Benton et al., 2017) |
Histone deacetylation | HDACs | Transcriptional regulation | (Li and Seto, 2016) |
H3K4 methylation | SET1 | Transcription activation | (Zaidi et al., 2013) |
H3K9 methylation | EZH2, G9a/GLP | Transcription repression | (Iacono et al., 2018) |
H3K27 methylation | PRC2 | Transcription repression (H3K27me1 is associated with transcription activation) | (Huang et al., 2021) |
H3K79 methylation | DOT1L | Transcription activation | (Johnson et al., 2009) |
RNA methylation | METTL3, METTL14, WTAP | Installation of m6A on RNA residue | (Ma et al., 2017) |
RNA methylation | YTHDF1/2/3, YTHDC1 | Binding with m6A | (Paris et al., 2019) |
RNA demethylation | FTO, ALKBH5 | Removal of m6A on RNA residue | (Zhang et al., 2016) |
Table 3. Multimodal omics techniques
Methods | Profiling features | Reference |
---|---|---|
DR-seq | Genome, transcriptome | (Dey et al., 2015) |
G&T-seq | Genome, transcriptome | (Macaulay et al., 2015) |
scTrio-seq | Genome, transcriptome | (Hou et al., 2016) |
SIDR-seq | Genome, transcriptome | (Han et al., 2018) |
CORTAD-seq | Genome, transcriptome | (Kong et al., 2019) |
TARGET-seq | Genome, transcriptome | (Rodriguez-Meira et al., 2019) |
PEA/STA | Transcriptome, proteome | (Genshaft et al., 2016) |
PLAYR | Transcriptome, proteome | (Frei et al., 2016) |
Abseq | Transcriptome, proteome | (Shahi et al., 2017) |
CITE-seq | Transcriptome, proteome | (Stoeckius et al., 2017) |
REAP-seq | Transcriptome, proteome | (Peterson et al., 2017) |
RAID | Transcriptome, proteome | (Gerlach et al., 2019) |
ECCITE-seq | Transcriptome, proteome | (Mimitou et al., 2019) |
sci-CAR | Chromatin accessibility, transcriptome | (Cao et al., 2018) |
T-ATAC-seq | Chromatin accessibility, transcriptome | (Satpathy et al., 2018) |
scCAT-seq | Chromatin accessibility, transcriptome | (Liu et al., 2019) |
SNARE-seq | Chromatin accessibility, transcriptome | (Chen et al., 2019) |
ATAC-RNA-seq | Chromatin accessibility, transcriptome | (Reyes et al., 2019) |
Paired-seq | Chromatin accessibility, transcriptome | (Zhu et al., 2019) |
scMT-seq | DNA methylome, transcriptome | (Hu et al., 2016) |
scM&T-seq | DNA methylome, transcriptome | (Angermueller et al., 2016) |
sc-GEM | DNA methylome, transcriptome | (Cheow et al., 2016) |
scNMT-seq | Chromatin accessibility, DNA methylome, transcriptome | (Clark et al., 2018) |
scChaRM-seq | Chromatin accessibility, DNA methylome, transcriptome | (Yan et al., 2021) |
scNOMeRe-seq | Chromatin accessibility, DNA methylome, transcriptome | (Wang et al., 2021) |
snmCAT-seq | Chromatin accessibility, DNA methylome, transcriptome | (Luo et al., 2022) |
DART-seq | Transcriptome, RNA methylome | (Meyer, 2019) |
ORCA | Transcriptome, chromatin conformation | (Mateo et al., 2019) |
Department of Life Science, University of Seoul, Seoul 02504, Korea
In eukaryotic cells, DNA is packaged into chromatin in the nucleus (Jenuwein and Allis, 2001). Eukaryotic chromatin is a highly condensed structure that is essential for basic nuclear processes such as transcription and replication. Chromatin is divided into weakly and strongly condensed regions. The weakly condensed region, called euchromatin, is generally transcriptionally active, whereas the strongly condensed region, called heterochromatin, is where gene expression is restricted (Back, 1976). The nucleosome is the basic unit of chromatin and consists of approximately 147 base pairs of DNA and a histone octamer containing two copies each of H2A, H2B, H3, and H4 (Jenuwein and Allis, 2001). The histone tails can be modified post-translationally by acetylation, methylation, and phosphorylation (Strahl and Allis, 2000) (Fig. 1). These post-translational covalent modifications are accumulated by epigenetic mechanisms and can alter the chromatin state and subsequent gene expression (Kouzarides, 2007) (Fig. 2). Histone modifications are mainly profiled using chromatin immunoprecipitation followed by sequencing (ChIP-seq) (Barski et al., 2007; Mikkelsen et al., 2007). In recent studies, histone modification information has been analyzed from small numbers of cells using Tn5 transposase-mediated tagmentation techniques such as cleavage under targets and tagmentation (CUT&TAG) (Kaya-Okur et al., 2019) (Table 1).
Histone acetylation is the process by which the acetyl group of acetyl-CoA is transferred to the NH3+ group of the histone lysine. The addition of an acetyl group neutralizes the positive state of lysine and reduces the binding of histone proteins to DNA, making the DNA open and accessible to transcription factors (Bannister and Kouzarides, 2011). Histone hyperacetylation is a hallmark of transcriptional activity (Clayton et al., 1993; Pogo et al., 1966). The balance of histone acetyltransferases (HATs) and histone deacetylases (HDACs) is an important factor in the regulation of histone acetylation. HATs can be divided into two classes based on their subcellular localization and function (Parthun, 2007). A-type HATs are nuclear enzymes involved in the regulation of gene expression through the acetylation of nucleosomal histones in the chromatin context. B-type HATs are located in the cytoplasm and are responsible for acetylating newly synthesized histones prior to their transport from the cytoplasm to the nucleus where they assemble into nucleosomes (Roth et al., 2001) (Table 2).
HDACs oppose the effects of HATs and reverse lysine acetylation. In humans, there are 18 HDACs grouped into four distinct classes: class I (Rpd3-like) (HDAC1-3 and HDAC8), class II (Hda1-like) (HDAC4-7, HDAC9, and HDAC10), class III (Sir2-like) NAD-dependent enzymes (SIRT1-7), and class IV, which contains the single member HDAC11 (Yang and Seto, 2008). In general, they are involved in multiple signaling pathways and have relatively low specificity for particular acetyl groups, allowing a single enzyme to deacetylate multiple sites within histones. Various HDAC inhibitors have been developed and used to treat tumors. They can induce cancer cell cycle arrest, differentiation, and apoptosis (Bose et al., 2014; Suraweera et al., 2018) (Table 2).
Phosphorylation is one of the most common post-translational modifications and occurs on serine, threonine, and tyrosine residues of histone proteins (North et al., 2014). Similar to histone acetylation, this modification is highly dynamic and its levels are regulated by the addition and removal of phosphate groups (Oki et al., 2007). Phosphoryl groups are transferred from ATP to the hydroxyl groups of amino acids by kinases, thereby adding a negative charge to histone proteins. In mammalian cells, H3S10 phosphorylation, which is mediated by Aurora B kinase, is essential for mitosis and meiosis (Wei et al., 1998). This is because H3S10 phosphorylation dissociates from chromatin the HP1 protein, which is recruited by H3K9me3 in interphase and contributes to heterochromatin formation, and thereby prevents the formation of condensed heterochromatin. In addition, phosphorylation of serine 139 of H2AX is a histone modification induced by ATM and ATR. This modification is involved in various DNA damage response pathways, including non-homologous end joining and homologous recombination (Cheung et al., 2005; Downs et al., 2004).
Histone methylation usually occurs on the side chains of lysine and arginine and is one of the most important post-translational modifications. Unlike acetylation and phosphorylation, histone methylation does not affect the charge on histone proteins. Histone lysine residues can be mono-, di-, or tri-methylated, while arginine residues can be mono- or di-methylated (Lan and Shi, 2009; Ng et al., 2009). Most histone lysine methyltransferases (HKMTs) have an evolutionarily well-conserved sequence motif identified in
H3K9 methylation is associated with heterochromatin. The methylation of H3K9 is driven by SUV39H1/2, G9a, G9a-like protein (GLP), and SETDB1 (Fritsch et al., 2010). SETDB1 co-translationally catalyzes mono- and dimethylation when H3K9 binds to ribosomes. G9a and GLP can form homomeric and heteromeric complexes and are involved in the formation of H3K9me1 and H3K9me2 (Tachibana et al., 2002). SUV39H1/2 is a key enzyme for H3K9me3 in the pericentromeric heterochromatin (Rea et al., 2000). H3K4 methylation is enriched in enhancer regions, promoter regions, and transcription start sites and is usually associated with the transcriptional activation of nearby genes. H3K4me1 and H3K4me3 are distributed in the enhancer and promoter regions, respectively, and H3K4me2 is distributed in the replication origin region and the 5' end of genes (Heintzman et al., 2007; Kim and Buratowski, 2009; Santos-Rosa et al., 2002). Their distribution locations are related to their functions. Set1, an H3K4 methyltransferase identified in yeast, operates in COMPASS, a protein complex associated with Set1 (Briggs et al., 2001; Miller et al., 2001). This enzyme is well conserved from yeast to humans and is responsible for H3K4 methylation. H3K27 methylation is induced by PRC2, a complex of polycomb group proteins. PRC2 has four core subunits (EZH2, SUZ12, EED, and RbAp46/48), and EZH2 catalyzes methylation (Cao et al., 2002; Kuzmichev et al., 2002). H3K27me2 and H3K27me3 are the hallmarks of gene repression (Banaszynski et al., 2013; Barski et al., 2007) (Fig. 2). Conversely, H3K27me1 is associated with transcriptional activation and is distributed in the promoter regions of active genes (Barski et al., 2007). H3K36 and H3K79 methylation are considered to be active markers distributed in actively transcribed chromatin regions. H3K36 can be mono- and dimethylated by NSD1-3 and ASH1L, and trimethylation is catalyzed by SETD2 (Eram et al., 2014).
H3K36 methylation can inhibit the activity of PRC2, preventing H3K27 methylation catalyzed by PRC2 (Yuan et al., 2011). In mammals, H3K79 is methylated by DOT1L. Unlike other HKMTs, DOT1L does not contain a SET domain; its substrate, H3K79, is located in the globular histone core, which is difficult to access (Jones et al., 2008) (Table 2).
Aberrant histone modifications can impair chromosomal segregation or cause abnormal regulation of oncogenes and/or tumor suppressor genes (Table 2). One change frequently observed in cancer cells is the HDAC-mediated loss of histone acetylation (Li and Seto, 2016). SIRT1 and deacetylase activities are upregulated in various tumor types. In addition to the loss of acetylation, gene silencing caused by increased H3K27me3 resulting from EZH2 overexpression has been implicated in the progression of several solid malignancies, such as breast cancer (Kleer et al., 2003). Overexpression of the lncRNA HOTAIR in ovarian cancer recruits EZH2 and alters the H3K27me3 landscape (Dai et al., 2021). It has also been reported that the histone demethylase lysine-specific demethylase 4A promotes the progression of nasopharyngeal carcinoma by promoting hypoxia-inducible factor-1α expression (Zhao et al., 2021). Aberrant histone modifications are closely related to cancer, and inhibitors of histone-modifying enzymes, such as HDAC inhibitors, are being actively studied and developed.
DNA methylation was first reported in 1948, and its role in gene regulation was first suggested in the mid-1970s (Holliday and Pugh, 1975; Hotchkiss, 1948). In mammalian DNA, 5-methylcytosine (5mC) is abundant at CpG sites, sequences in which a cytosine is followed by a guanine in the 5' to 3' direction (Coulondre et al., 1978). Most DNA methylation is stable and plays important roles during development and the cell cycle in several epigenetic processes, such as genomic imprinting and X-chromosome inactivation (Smith and Meissner, 2013). DNA methylation, which is mediated by DNA methyltransferase enzymes (DNMTs), also plays an important role in controlling the epigenetic environment during reprogramming of somatic cells into pluripotent stem cells (Lee et al., 2014) (Table 2). 5mC is mainly profiled using whole-genome bisulfite sequencing (BS-seq) (Cokus et al., 2008). In addition, techniques such as reduced representation bisulfite sequencing (RRBS) and methylated DNA immunoprecipitation sequencing (MeDIP-seq) have been developed to profile DNA methylation information (Meissner et al., 2005; Weber et al., 2005) (Table 1).
In the mammalian genome, CpG islands are DNA sequences approximately 1000 base pairs in length with a higher CpG density than the rest of the genome (Bird et al., 1985). Most gene promoters (more than two-thirds), particularly promoters of housekeeping genes, are embedded in CpG islands (Saxonov et al., 2006). In general, unlike other CpG sites, CpG sites in promoter regions within CpG islands are not methylated (Bird et al., 1985). Silencing of gene expression by methylation of promoter DNA is stable and long-lasting, unlike transcriptional repression by histone modifications (Mohn et al., 2008). Because of these properties, methylation of CpG islands is an important epigenetic mechanism that regulates imprinted genes and gene expression during development and differentiation. Unlike promoter methylation, gene body DNA methylation is positively correlated with gene expression and is considered a characteristic of transcribed genes in cells (Ball et al., 2009). CpG islands within genes or gene bodies can reveal specific methylation differences between tissues or cancer samples. However, CpG islands in promoter regions show little difference in methylation; instead, specific differences are found at some distance from the CpG islands, in regions called CpG island shores (Irizarry et al., 2009).
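The notion of "higher CpG density" can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from this review: it computes the GC content and the observed/expected CpG ratio of a sequence. The function names and the thresholds (GC content of at least 0.5 and an observed/expected ratio of at least 0.6) are assumptions chosen for illustration, following a commonly used heuristic; real CpG island callers also apply a minimum length and scan the genome in windows.

```python
def cpg_stats(seq: str) -> dict:
    """Return GC content and the CpG observed/expected ratio of a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # number of CpG dinucleotides (5'-CG-3')
    gc_content = (c + g) / n
    # Expected CpG count under independence is (c * g) / n,
    # so observed/expected = cpg * n / (c * g).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"gc_content": gc_content, "cpg_obs_exp": obs_exp, "cpg_count": cpg}


def looks_like_cpg_island(seq: str, min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Heuristic check; the thresholds are illustrative assumptions, not values from the review."""
    stats = cpg_stats(seq)
    return stats["gc_content"] >= min_gc and stats["cpg_obs_exp"] >= min_obs_exp


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(looks_like_cpg_island("ATATATAT"))
```

A window-based scan over a chromosome would call `cpg_stats` on each window and merge adjacent windows that pass the check.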
DNA methylation consists of three stages: de novo DNA methylation, maintenance, and demethylation (Fig. 3). DNMT3A and DNMT3B, known as de novo methyltransferases, consist of three main domains: the Pro-Trp-Trp-Pro (PWWP), ATRX-DNMT3-DNMT3L (ADD), and MTase domains (Okano et al., 1998; 1999). Until the ADD domain binds to the unmodified K4 residue of the H3 tail, it binds to the MTase domain and acts as an inhibitor (Guo et al., 2015). When the ADD domain binds to unmodified H3K4, it is released from the MTase domain, which enables DNA methylation (Otani et al., 2009). As the ADD domain is repelled by the increasing number of methyl residues in H3K4, DNA methylation does not normally occur in CpG-rich promoters of H3K4me3-enriched genes (Piunti and Shilatifard, 2016). As transcription proceeds, the PWWP domain binds to H3K36me3 generated by the histone methyltransferase SETD2 to induce DNA methylation in the gene body (Dhayalan et al., 2010). Among de novo methylations, only symmetrical CpG methylation is maintained during DNA replication. This depends on the activity of DNMT1 and UHRF1, an E3 ubiquitin-protein ligase with five conserved domains (Arita et al., 2012). The TTD and PHD domains of UHRF1 work together to recognize and bind to H3K9me3, thereby loading DNMT1 onto the newly synthesized DNA substrate (Karg et al., 2017). Therefore, DNMT1 is located at the replication fork, where newly synthesized hemimethylated DNA is formed during DNA replication (Leonhardt et al., 1992). DNMT1 is called the maintenance DNMT because it maintains the pattern of DNA methylation. DNA methylation can be erased by passive or active demethylation. Passive demethylation leads to replication-dependent dilution of 5mC due to defects in the maintenance methylation mechanism that copies methylation patterns during DNA replication (Howell et al., 2001).
Passive demethylation is biologically important for erasing DNA methylation in preimplantation embryos and primordial germ cells. Active DNA demethylation is mediated by the TET1, TET2, and TET3 enzymes. TET proteins oxidize 5mC to produce 5-hydroxymethylcytosine (5hmC), which initiates demethylation (Ito et al., 2010; Tahiliani et al., 2009). 5hmC can be further oxidized to generate 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC), which can be removed by DNA glycosylases (TDG and SMUG1) and replaced with an unmethylated cytosine through the base excision repair pathway (Maiti and Drohat, 2011; Weber et al., 2016) (Table 2).
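The replication-dependent dilution of 5mC during passive demethylation can be illustrated with a toy calculation. The sketch below is a deliberately simplified model and not one described in this review: it assumes that parental strands keep their methyl marks, and that each newly synthesized strand copies a mark with a fixed probability, here called `maintenance_efficiency` (a hypothetical parameter). Under these assumptions the methylated fraction is multiplied by (1 + efficiency) / 2 at every round of replication.

```python
def methylated_fraction(generations: int,
                        maintenance_efficiency: float = 0.0,
                        initial_fraction: float = 1.0) -> float:
    """Fraction of methylated strands at a CpG after a number of replication rounds.

    Toy model (an assumption for illustration): parental strands stay methylated,
    and new strands are methylated with probability `maintenance_efficiency`.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    per_round = (1.0 + maintenance_efficiency) / 2.0
    return initial_fraction * per_round ** generations


if __name__ == "__main__":
    # No maintenance: the mark is halved at each replication.
    for n in range(4):
        print(n, methylated_fraction(n))
    # Perfect maintenance: the pattern is preserved.
    print(methylated_fraction(10, maintenance_efficiency=1.0))
```

With no maintenance activity the fraction halves at each division, which is why a few rounds of replication are enough to erase most of the mark in this simplified picture.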
DNA methylation and histone modification work together to regulate transcription. Because DNMTs generally suppress gene regions through methylation, they inhibit gene expression by interacting with enzymes that regulate histone modification. DNMT1 and DNMT3a restrict gene expression by binding to SUV39H1, an enzyme that methylates H3K9 (Fuks et al., 2000). In addition, DNMT1 and DNMT3b bind to HDACs and regulate gene expression (Fuks et al., 2000; Geiman et al., 2004). Methyl-binding proteins, such as MeCP2 and UHRF, enhance gene repression by interacting with methylated DNA and histones (Nan et al., 1998). Although the interaction between RNA and DNA modifications is not clearly defined, recent studies have reported that high N6-methyl adenosine (m6A) modifications in esophageal squamous cell carcinoma cells lead to DNA demethylation, altering chromatin accessibility and affecting gene transcription (Deng et al., 2022).
DNA methylation is involved in various diseases such as brain disorders and cancer (Table 2). For example, an autosomal-dominant mutation in the N-terminal regulatory domain of DNMT1 was identified in patients with hereditary sensory and autonomic neuropathy type 1, who presented with dementia, hearing loss, and narcolepsy in adulthood (Klein et al., 2011). DNMT3A is associated with the development of acute myeloid leukemia (AML), and TET2 mutations are considered a common epigenetic marker in several hematological malignancies including AML, chronic myelomonocytic leukemia, and lymphomas (Langemeijer et al., 2009; Spencer et al., 2017). Overexpression of UHRF1 promotes hypermethylation of the promoter of thioredoxin-interacting protein (TXNIP), a tumor suppressor gene, and downregulates TXNIP expression in cervical cancer, contributing to carcinogenesis (Kim et al., 2021). The DNMT3A mutation is a missense mutation of arginine 882 (R882) in the MTase domain, which interferes with DNMT3A formation and reduces the methylation activity of the enzyme (Russler-Germain et al., 2014). Conversely, TET2 mutations induce hypermethylation of enhancer regions in myeloid malignancies (Ko et al., 2010). It is still unclear how two mutants with opposite functions can produce similar phenotypes. A recent study on patients with AML demonstrated that mutations in DNMT3A and TET2 can cause irregular DNA methylation patterns and transcriptional expression levels in genes known to be involved in AML pathogenesis (Ponciano-Gómez et al., 2017). In addition, Lee et al. (2023) reported that the regulation of C-Maf-inducing protein methylation mediated by DNMT1 and TET2 was correlated with nonalcoholic fatty liver disease.
In mammalian cells, DNA forms nucleosomes, which are arranged into higher-order chromatin structures that play important roles in regulating the cell cycle, replication, development and gene function. The three-dimensional chromatin structure includes cis-regulatory interactions such as enhancer-promoter interactions and repressive interactions such as lamina-associated domains (LADs) and is mediated by structural elements such as CCCTC-binding factor (CTCF) and cohesin (Huang et al., 2021; Rowley and Corces, 2018; Schoenfelder and Fraser, 2019).
Enhancers, a representative cis-regulatory element, are usually located several kilobases away from gene promoters but can be involved in transcriptional regulation by coming into close spatial proximity to promoters through chromatin looping and folding. Chromatin conformation capture (3C) profiles the interactions between these specific genomic regions (Dekker et al., 2002). A more recent technique, Hi-C, allows global quantification of all interactions present in the nucleus. Based on Hi-C data, it became possible to analyze compartments, which are sets of chromosomal regions with similar long-range Hi-C contact patterns, and self-interacting genomic regions called topologically associated domains (TADs) (Dixon et al., 2012; Lieberman-Aiden et al., 2009). Compartment A is associated with open chromatin, and Compartment B is associated with closed chromatin and is specific to the cell type (Dixon et al., 2012). Previous studies have reported that TADs are well conserved among various cell types and species. However, multiple methods based on Hi-C have been developed at the single-cell level, and recent studies have shown that chromatin contacts, such as TAD structures, vary considerably at the single-cell level (Stevens et al., 2017). In general, the frequency of interactions within TADs is high and the frequency of interactions between TADs is low. CTCF is an important factor in regulating TAD structure and is involved in the insulation between TADs (Szabo et al., 2020). The disruption of TAD boundaries can affect gene expression and is associated with various diseases and cancers. A recent study reported that the combination of genome profiling and CRISPR-Cas9 genome engineering could identify regions with repetitive changes in the 3D genome structure and predict oncogene activity (Xu et al., 2022). The nuclear lamina and its constituent filament proteins, the A/B-type lamins, are located at the nuclear envelope.
LADs are heterochromatic regions that constitute approximately 40% of the genome and are in contact with the nuclear lamina. LADs are enriched in H3K9me2/3 and H3K27me3, and few genes within LADs are expressed (Guelen et al., 2008; Lund et al., 2014).
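The statement that contacts are frequent within TADs and rare between them can be checked on a contact matrix once TAD intervals are known. The following is a minimal sketch under stated assumptions: the matrix is a small made-up example rather than real Hi-C data, TADs are given as half-open bin intervals, and the function name `mean_contacts` is hypothetical. Real analyses also normalize the matrix and correct for genomic distance, which this sketch omits.

```python
def mean_contacts(matrix, tads):
    """Return (mean intra-TAD contact, mean inter-TAD contact) for a symmetric matrix.

    `tads` is a list of (start_bin, end_bin) half-open intervals.
    Bins outside every TAD are ignored.
    """
    bin_to_tad = {}
    for tad_index, (start, end) in enumerate(tads):
        for b in range(start, end):
            bin_to_tad[b] = tad_index

    intra, inter = [], []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, diagonal excluded
            if i not in bin_to_tad or j not in bin_to_tad:
                continue
            if bin_to_tad[i] == bin_to_tad[j]:
                intra.append(matrix[i][j])
            else:
                inter.append(matrix[i][j])

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


if __name__ == "__main__":
    # Toy 4-bin contact matrix with two TADs: bins 0-1 and bins 2-3.
    toy = [
        [0, 10, 1, 1],
        [10, 0, 2, 1],
        [1, 2, 0, 8],
        [1, 1, 8, 0],
    ]
    print(mean_contacts(toy, [(0, 2), (2, 4)]))
```

A large ratio between the two means indicates well-insulated domains; a ratio near one would suggest that the proposed boundaries do not separate the contacts.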
Mammalian RNA undergoes numerous post-transcriptional chemical modifications. Although these RNA modifications were discovered in the 1970s, their importance was not highlighted for a long time. Over the past decade, several studies have shown that RNA modifications play an important role in regulating gene expression, similar to the epigenetic modifications of DNA and histones (Helm and Motorin, 2017).
m6A is the most abundant endogenous RNA modification in eukaryotes. m6A is precisely controlled by three protein groups: “writers” that install m6A residues along target RNA transcripts, “erasers” that remove modifications at specific sites, and “readers” that recognize modified regions. m6A RNA methylation is catalyzed by the “writer” proteins methyltransferase-like 3 and 14 (METTL3 and METTL14). METTL3 and METTL14 form stable heterodimer complexes (Bokar et al., 1997; Liu et al., 2014). In addition, the METTL3-METTL14 complex interacts with Wilms tumor 1-associated protein (WTAP) to target a wide range of RNA substrates and install m6A (Ping et al., 2014). m6A demethylation is catalyzed by the AlkB family of non-heme Fe(II)/α-KG-dependent dioxygenases. Representative human AlkB family members include fat mass- and obesity-associated protein (FTO) and alkylation repair homolog protein 5 (ALKBH5). FTO is an RNA demethylase first discovered in mammalian cells that removes m6A by generating N6-hydroxymethyladenosine and N6-formyladenosine, which can be hydrolyzed into adenine through continuous oxidation over several hours (Fu et al., 2013). ALKBH5 catalyzes the m6A-to-A conversion by directly removing the methyl group (Jang et al., 2022; Zheng et al., 2013). The YT521-B homology (YTH) domain-containing protein family recognizes and binds to m6A. YTHDC1 is located in the nucleus and is associated with various RNA splicing regulators, such as serine-arginine repeat proteins. It recognizes m6A on the lncRNA XIST and mediates X chromosome silencing (Patil et al., 2016; Zhang et al., 2010). YTHDF1 and YTHDF3 regulate translation efficiency by interacting with the ribosome of the target RNA, and YTHDF2 promotes the degradation of m6A-modified RNA by recruiting deadenylation enzyme complexes, thereby affecting RNA stability (Du et al., 2016; Li et al., 2017; Wang et al., 2015).
m6A modification is associated with various diseases and cancers. METTL3 promotes the maturation of miR-1246 by methylating pri-miR-1246, which down-regulates the anti-oncogene SPRED2 and thereby enhances the metastatic ability of colorectal cancer (Peng et al., 2019). In addition, METTL3 is associated with prostatitis and Aicardi syndrome, FTO is an oncogene in AML, breast cancer, and colorectal cancer, and YTHDF1 and YTHDF2 are associated with pancreatic cancer and breast cancer, respectively (Huang et al., 2019; Liu and Pan, 2015; Pan et al., 2021).
Several techniques have been developed to profile epigenetic regulation (Table 1). However, although these bulk-cell profiling methods showed an average signal for a cell population, it was difficult to represent cell-to-cell variation within tissue, such as gene expression heterogeneity. Single-cell epigenomic analysis has the potential to overcome these limitations and to elucidate gene regulatory mechanisms across diverse cellular environments.
Single-cell ChIP-seq (scChIP-seq) was the first reported technique for profiling histone modifications at the single-cell level. scChIP-seq separates single cells into droplets containing lysis buffer and micrococcal nuclease (MNase), followed by barcoding before immunoprecipitation (Rotem et al., 2015). Since then, several technologies have been developed, such as single-cell chromatin immunocleavage followed by sequencing, chromatin integration labeling sequencing, single-cell cleavage under targets and release using nuclease (scCUT&RUN), and CUT&TAG (Bartosovic et al., 2021; Harada et al., 2019; Kaya-Okur et al., 2019; Ku et al., 2019). Several single-cell studies have confirmed that cell-to-cell variations in histone modifications are correlated with heterogeneity in gene expression. For example, measurements of H3K4me2 levels within mouse embryonic stem cell populations revealed significant variation, which was observed in gene enhancers and transcriptionally repressed genes (Rotem et al., 2015). In addition, integrating single-cell-level H3K4me3 profiling and scRNA-seq in the mouse brain confirmed cellular heterogeneity in oligodendrocyte populations that had appeared homogeneous and revealed that they could be divided into subpopulations enriched with module-specific genes (Bartosovic et al., 2021).
Single-cell level profiling of DNA methylation has mainly been achieved through bisulfite conversion methods such as single-cell BS-seq (scBS-seq), single-cell RRBS (scRRBS), and post-bisulfite adapter-tagging (Guo et al., 2013; Miura et al., 2012; Smallwood et al., 2014). Bisulfite sequencing is based on the conversion of unmethylated cytosine into uracil. scRRBS generates DNA fragments containing CpG-rich ends by using one or more restriction enzymes, followed by bisulfite sequencing (Guo et al., 2013). In scBS-seq, bisulfite treatment is followed by random priming and extension after cell isolation and lysis (Smallwood et al., 2014). These methods have shown heterogeneity in DNA methylation in both mouse and human cells. Multi-omics techniques have also been developed to investigate the interaction between DNA methylation heterogeneity, gene expression heterogeneity, and other epigenetic data (Table 3). For example, single-cell methylome and transcriptome sequencing (scM&T-seq) has demonstrated a functional link between transcriptional and DNA methylation heterogeneity of gene promoters within specific cell and tissue types in mouse muscle stem cells (Angermueller et al., 2016).
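The principle behind bisulfite sequencing can be sketched in a few lines. The example below is an illustrative simulation, not a pipeline from this review: it assumes a single read that is already aligned to the start of the reference on the same strand, and the function names are hypothetical. Unmethylated cytosines are converted to uracil, which is read as thymine after amplification, whereas methylated cytosines remain cytosine, so comparing the read with the reference reveals the methylation state of each cytosine.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate bisulfite treatment followed by amplification.

    Unmethylated C is read as T; methylated C (0-based positions) stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference: str, read: str) -> dict:
    """Return {position: is_methylated} for each reference C covered by the read.

    Assumes (for illustration) that the read aligns to the reference at position 0.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True    # protected from conversion, so methylated
        elif read_base == "T":
            calls[i] = False   # converted, so unmethylated
    return calls


if __name__ == "__main__":
    reference = "ACGTCGA"
    read = bisulfite_convert(reference, methylated_positions={1})
    print(read)
    print(call_methylation(reference, read))
```

In practice, dedicated aligners handle the reduced sequence complexity of converted reads and both strands; the sketch only shows the comparison step that turns C/T differences into methylation calls.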
Chromatin accessibility plays an important role in the regulation of gene expression by influencing transcription initiation. Chromatin accessibility can be profiled by the assay for transposase-accessible chromatin using sequencing (ATAC-seq) and DNase I hypersensitive site sequencing (DNase-seq). These methods measure chromatin accessibility based on enzymatic sensitivity (Buenrostro et al., 2013; Song and Crawford, 2010). Single-cell ATAC-seq (scATAC-seq) and single-cell DNase-seq, similar to other single-cell epigenetic profiling techniques, can identify heterogeneity in chromatin accessibility between cells. Cusanovich et al. (2015) applied combinatorial indexing to intact nuclei and were able to distinguish between various cell types by profiling thousands of cells. In addition, a recent study performed scATAC-seq and scRNA-seq on DNA and mRNA of the same cell, respectively, and found a positive correlation between gene expression and chromatin accessibility heterogeneity in each cell (Reyes et al., 2019).
Research on 3D genome structure has been greatly improved by bulk cell Hi-C, but cell-to-cell variability of 3D genome features such as TADs or enhancer-promoter contacts could not be accurately reflected. Single-cell Hi-C can be used to perform interaction profiling depending on the proximity ligation region. These techniques based on Hi-C have revealed that chromatin contact varies depending on cell type and developmental status. Using single-cell Hi-C, Nagano et al. (2017) found that mouse embryonic stem cells showed chromosomal condensation in the early G1 phase and extensive reorganization during replication, suggesting that the cell cycle is related to heterogeneity in chromatin contact. In addition, single-nucleus Hi-C removed the previously used biotin-filling step and performed sticky end ligation, revealing that the chromatin structure was uniquely reconstructed during the oocyte-to-zygote transition in mice (Flyamer et al., 2017) (Fig. 4).
Recently, multi-modal omics techniques that can simultaneously profile two or more epigenetic features in a single cell have facilitated the analysis of the correlations between different traits. For example, sn-m3C-seq can simultaneously profile DNA methylation and chromatin conformation at the single-cell level (Lee et al., 2019). In addition, Paired-seq and SNARE-seq can simultaneously profile gene expression and chromatin accessibility information, and scM&T-seq and snmC2T-seq can co-profile transcriptome and methylome information (Angermueller et al., 2016; Chen et al., 2019; Luo et al., 2022; Zhu et al., 2019). Many multi-modal omics techniques are still being developed, which will further advance our biological understanding (Table 3).
In this review, we summarized representative epigenetic features such as DNA methylation and histone modification. Epigenetics research relies mostly on molecular biology approaches. Although they are not genetic changes, DNA and histone modifications and three-dimensional chromatin structure have a great influence on the changes that can occur within individual cells. Since the 1980s, scientific interest in the molecular mechanisms of epigenetic control as a means of understanding disease development and treatment has continued to grow. Despite numerous studies, several epigenetic mechanisms and patterns remain unknown. However, by integrating advanced techniques and sophisticated algorithms, it is now possible to generate and analyze large volumes of epigenetic data. Single-cell multi-omics sequencing techniques, which are being actively studied, can generate information on two or more epigenetic traits, and these data can facilitate correlation analysis between different traits. Advances in these technologies and the information they produce will provide opportunities to establish novel epigenetic markers and their functions in different tissues, developmental stages, and disease states.
This work was supported by the 2021 Research Fund of the University of Seoul.
U.K. and D.S.L. wrote the manuscript.
The authors have no potential conflicts of interest to disclose.
Table 1. Methods for profiling epigenetic traits
Data type | Methods | Reference |
---|---|---|
Histone modification | ChIP-seq | (Barski et al., 2007) |
Histone modification | CUT&TAG | (Kaya-Okur et al., 2019) |
Histone modification | CUT&RUN | (Skene and Henikoff, 2017) |
DNA methylation | BS-seq | (Cokus et al., 2008) |
DNA methylation | RRBS | (Meissner et al., 2005) |
DNA methylation | MeDIP-seq | (Weber et al., 2005) |
Chromatin contact | Hi-C | (Lieberman-Aiden et al., 2009) |
Chromatin contact | ChIA-PET | (Wei et al., 2006) |
Chromatin contact | Hi-ChIP | (Mumbach et al., 2016) |
Table 2. Epigenetic changes and associated proteins
Epigenetic modification | Associated protein | Function | Reference |
---|---|---|---|
Maintenance DNA methylation | DNMT1 | Maintains the pattern of DNA methylation | (Klein et al., 2011) |
De novo DNA methylation | DNMT3A, DNMT3B | Methylation of unmodified DNA | (Spencer et al., 2017) |
DNA demethylation | TET1, TET2, TET3 | Removal of methylation | (Ponciano-Gómez et al., 2017) |
Histone acetylation | p300/CBP | Transcription activation | (Benton et al., 2017) |
Histone deacetylation | HDACs | Transcriptional regulation | (Li and Seto, 2016) |
H3K4 methylation | SET1 | Transcription activation | (Zaidi et al., 2013) |
H3K9 methylation | EZH2, G9a/GLP | Transcription repression | (Iacono et al., 2018) |
H3K27 methylation | PRC2 | Transcription repression (H3K27me1 is associated with transcription activation) | (Huang et al., 2021) |
H3K79 methylation | DOT1L | Transcription activation | (Johnson et al., 2009) |
RNA methylation | METTL3, METTL14, WTAP | Installation of m6A on RNA residues | (Ma et al., 2017) |
RNA methylation | YTHDF1/2/3, YTHDC1 | Binding to m6A | (Paris et al., 2019) |
RNA demethylation | FTO, ALKBH5 | Removal of m6A from RNA residues | (Zhang et al., 2016) |
Table 3. Multimodal omics techniques
Methods | Profiling features | Reference |
---|---|---|
DR-seq | Genome, transcriptome | (Dey et al., 2015) |
G&T-seq | Genome, transcriptome | (Macaulay et al., 2015) |
scTrio-seq | Genome, transcriptome | (Hou et al., 2016) |
SIDR-seq | Genome, transcriptome | (Han et al., 2018) |
CORTAD-seq | Genome, transcriptome | (Kong et al., 2019) |
TARGET-seq | Genome, transcriptome | (Rodriguez-Meira et al., 2019) |
PEA/STA | Transcriptome, proteome | (Genshaft et al., 2016) |
PLAYR | Transcriptome, proteome | (Frei et al., 2016) |
Abseq | Transcriptome, proteome | (Shahi et al., 2017) |
CITE-seq | Transcriptome, proteome | (Stoeckius et al., 2017) |
REAP-seq | Transcriptome, proteome | (Peterson et al., 2017) |
RAID | Transcriptome, proteome | (Gerlach et al., 2019) |
ECCITE-seq | Transcriptome, proteome | (Mimitou et al., 2019) |
sci-CAR | Chromatin accessibility, transcriptome | (Cao et al., 2018) |
T-ATAC-seq | Chromatin accessibility, transcriptome | (Satpathy et al., 2018) |
scCAT-seq | Chromatin accessibility, transcriptome | (Liu et al., 2019) |
SNARE-seq | Chromatin accessibility, transcriptome | (Chen et al., 2019) |
ATAC-RNA-seq | Chromatin accessibility, transcriptome | (Reyes et al., 2019) |
Paired-seq | Chromatin accessibility, transcriptome | (Zhu et al., 2019) |
scMT-seq | DNA methylome, transcriptome | (Hu et al., 2016) |
scM&T-seq | DNA methylome, transcriptome | (Angermueller et al., 2016) |
sc-GEM | DNA methylome, transcriptome | (Cheow et al., 2016) |
scNMT-seq | Chromatin accessibility, DNA methylome, transcriptome | (Clark et al., 2018) |
scChaRM-seq | Chromatin accessibility, DNA methylome, transcriptome | (Yan et al., 2021) |
scNOMeRe-seq | Chromatin accessibility, DNA methylome, transcriptome | (Wang et al., 2021) |
snmCAT-seq | Chromatin accessibility, DNA methylome, transcriptome | (Luo et al., 2022) |
DART-seq | Transcriptome, RNA methylome | (Meyer, 2019) |
ORCA | Transcriptome, chromatin conformation | (Mateo et al., 2019) |