Mol. Cells 2020; 43(12): 1011-1022
Published online December 9, 2020
https://doi.org/10.14348/molcells.2020.0207
© The Korean Society for Molecular and Cellular Biology
Correspondence to : LKKIM@yuhs.ac (LKK); yjkim@yonsei.ac.kr (YJK)
Cell type specification is a delicate biological event in which every step is under tight regulation. From a molecular point of view, cell fate commitment begins with chromatin alteration, which kickstarts lineage-determining factors to initiate a series of genes required for cell specification. Several important neuronal differentiation factors have been identified from ectopic over-expression studies. However, there is scarce information on which DNA regions are modified during induced pluripotent stem cell (iPSC) to neuronal progenitor cell (NPC) differentiation, the cis regulatory factors that attach to these accessible regions, or the genes that are initially expressed. In this study, we identified the DNA accessible regions of iPSCs and NPCs via the Assay for Transposase-Accessible Chromatin sequencing (ATACseq). We identified which chromatin regions were modified after neuronal differentiation and found that the enhancer regions had more active histone modification changes than the promoters. Through motif enrichment analysis, we found that NEUROD1 controls iPSC differentiation to NPC by binding to the accessible regions of enhancers in cooperation with other factors such as the Hox proteins. Finally, by using Hi-C data, we categorized the genes that directly interacted with the enhancers under the control of NEUROD1 during iPSC to NPC differentiation.
Keywords chromatin accessibility, Hi-C, induced pluripotent stem cell, NEUROD1, neuronal progenitor cel
Human induced pluripotent stem cells (hiPSCs) have been heavily utilized as experimental platforms by various research areas studying lineage differentiation and modelling of human developmental process and diseases (Cotovio and Fernandes, 2020; Rowe and Daley, 2019; Shum et al., 2015). The need for iPSC is especially high in disease modelling given the difficulties in isolating cells from patients and obtaining the optimal number of cells for experiments. Among different cell lineages from iPSC, neuronal lineage which can be matured to functional neuron is of special clinical interest, and iPSC-derived neuronal differentiation models are already in use in analysis of neurodegenerative disease such as Alzheimer’s disease, Parkinson’s disease, Huntington’s disease and motor neuron diseases (Chang et al., 2020; Wu et al., 2019). Although iPSC-derived NPC model has been well studied in terms of defining cellular phenotype and neuronal function according to gene expression, the mechanism of gene regulation is still elusive (Kang et al., 2017; Zhang et al., 2013).
Gene expression is tightly regulated during cell fate commitment (Rue and Martinez Arias, 2015), and one extensively studied layer of regulation is epigenetic modification such as histone and DNA methylation, which regulates the nuclear environments to control gene transcription (Cedar and Bergman, 2009). Other than direct modification, chromatin accessibility to nuclear macromolecules such as transcription factors (TFs) itself is another facet in gene regulation (Klemm et al., 2019). Highly accessible regions of chromatin are considered to be open and promote gene expression, whereas inaccessible regions are closed and inhibit expression (Cedar and Bergman, 2009). It is therefore important to localize the regions with cell-type specific accessibilities as they are inferred to contain genes that determine cell fate (Scott-Browne et al., 2016). Enhancers, for example, have been identified as regions that hold most chromatin accessibiity changes during neuronal development (Frank et al., 2015).
The enhancer is one of the
In this study, we identified cell type-specific accessible chromatin regions of iPSCs and NPCs by Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq) and also noted the DNA regions that changed during differentiation and found that the accessibility of the enhancer, but not the promoter, was related to active histone modification changes. Through motif enrichment analysis, we found that NEUROD1, Hox proteins, and ONECUT1 may control neuronal differentiation by binding to the accessible regions of the enhancer. Furthermore, we identified the initial genes that are controlled by NEUROD1 during iPSC to NPC differentiation using Hi-C sequencing. Finally, our results can be used as additional resource to aid studies of regulatory mechanism in neuronal development-related genes during iPSC to NPC differentiation by chromatin accessibility changes and TF bindings.
The use of human iPSCs was approved by the Institutional Review Board (IRB) of Yonsei University (permit No. 7001988-201802-BR-119-01E). In this study, we used pre-constructed iPSC cell lines (Cat. No. C-004-5C; Gibco, USA). We cultured iPSC cell lines on Matrigel (BD Biosciences, USA) with Essential 8 medium (Invitrogen, USA) coating (Chen et al., 2011; 2017; Higuchi et al., 2015). To differentiate iPSC to NPC, an embryoid body (EB) was generated through culturing of human iPSCs for 5-6 days on non-adherent petri dishes in Essential 8 (Invitrogen) with additional supplement of 5 μM dorsomorphin (DM; Sigma-Aldrich, USA) and 5 μM SB431542 (Sigma-Aldrich). EBs were then attached to the new culture dish coated with Matrigel (BD Biosciences) with neural induction medium comprised of DMEM/F12 medium (Invitrogen), 1× N2 supplement (Invitrogen), 1× nonessential amino acids (Invitrogen) for 6 days. When neural rosette formation appears in the center of the EBs, NPCs were collected by using Pasteur pipet (Jin et al., 2019; Seo et al., 2015; Shin et al., 2017).
ATAC-seq samples were prepared by described method (Buenrostro et al., 2015). We washed 50,000 cells with cold phosphate-buffered saline (PBS) and resuspended cells by cell lysis buffer (10 mM Tris-HCl [pH 7.4], 10 mM NaCl, 3 mM MgCl2, 0.1% IGEPAL CA-630) and centrifuged cells (10 min, 2,000
Prior to mapping, quality control steps were executed. We assessed reads by ‘FastQC’ for read quality control and adapter removing process was done by ‘Cutadapt’ (cutadapt -a CTGTCTCTTATACACATCT -q 10 --minimum-length 36 -o sample_1_trimmed.fastq -p sample_2_trimmed.fastq sample_1.fastq sample_2.fastq). Processed reads were aligned by ‘Bowtie2’ (bowtie2 --very-sensitive -x
We executed H3K27ac ChIP-seq for profiling enhancers in iPSC and NPC cells. Cells were washed with Dulbecco’s PBS (DPBS) and crosslinked by crosslinking buffer (100 mM NaCl, 0.1 mM EDTA, 5 mM HEPES [pH 8.0], 1% formaldehyde) for 10 min at room temperature. The crosslinking process was halted with 125 mM glycine in room temperature for 5 min with rotation and washed twice with cold DPBS. Crosslinked samples were resuspended in SDS lysis buffer (1% SDS, 50 mM Tris-HCl [pH 8.0], 10 mM EDTA) with protease inhibitor (Roche, Switzerland). After proceeding sonication (S220; Covaris, USA), the sonicated chromatin samples were incubated with H3K27ac antibody (39133; Active Motif, USA) and magnetic bead (10001D; Thermo Fisher Scientific, USA) for 4 h in 4°C with rotation. At the same time, input chromatin was stored from sonicated samples to be used as input control. After incubation, the bead complex was washed by RIPA-α (140 mM NaCl, 1 mM EDTA pH 8, 0.5 mM EGTA pH 8, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, 10 mM Tris pH 8) with 4 times. The chromatin immunoprecipitated complex was treated with RNaseA (19101; Qiagen) and executed overnight reverse-crosslinking at 68°C. The chromatin immunoprecipitated DNA was extracted using AMPure XP beads (A63881; Beckman Coulter), and libraries were prepared using NEB Next Ultra II DNA library Prep Kit (E7645; NEB, USA) following the manufacturer’s instruction. The ChIP-seq libraries were sequenced in 150 bp paired-end mode using Illumina NovaSeq 6000 platform.
We utilized H3K4me3 and H3K27me3 ChIP-seq data for iPSC and NPC in Gene Expression Omnibus (GEO) (GSE156723). For read alignment, we used BWA (0.7.7) to map reads to the human genome sequence (hg19). We further removed duplicated reads using Picard MarkDuplicate (1.141). We further inspected the quality of the ChIP-seq experiment by measuring FRiP (fraction of reads in peaks) using ChIPQC (3.3). For peak calling, we used MACS2 (2.1.0) to call the significantly enriched ChIP region compare to the input IgG control “q -0.01 (FDR < 0.01%)”. For the ATAC peak intensity comparison, we generated the normalized RPKM values by counting the number of reads for 50 bp bins then dividing by the total sequencing depth.
To identify specifically enriched motifs in ATAC peak groups, we used the HOMER algorithm with default options (Heinz et al., 2010). We analyzed the known motifs results derived from published data which are provided by HOMER algorithm.
We utilized RNA-seq data for iPSC and NPC in GEO (GSE156723). Sequenced reads were mapped and aligned to the hg19 reference using Tophat2 (v2.1.0) and STAR (v2.6.0a) with the support of GRCh37 reference. From the mapped reads, we executed calculation of TPM (transcripts per million) using RSEM (v1.2.31) at the gene level. For data visualization browser, bamCoverage (deepTools 3.5.0) was used with default options.
Hi-C data processing is executed by described method in Yang et al. (2018). Raw reads were aligned to hg19 human reference genome. To focus on cis-chromosomal interactions, invalid Hi-C reads were removed by criteria which defined in previous studies; inter-chromosomal interactions, putative self-ligations, and non-restriction site specific ligations (Dixon et al., 2015; Schmitt et al., 2016). Valid cis-interactions were collected to build raw interaction frequency matrix with 5 kb resolution. A negative binomial model-based normalization pipeline removed biases of the interaction frequency matrix. Finally, distance dependent background signal was used to normalization for emphasizing biologically significant long-range chromatin interactions.
The NGS sequence data generated in this paper have been deposited in the GEO database, www.ncbi.nlm.nih.gov/geo (accession No. GSE158382).
Before producing ATAC-seq data for iPSC to NPC, we confirmed our differentiation system by detecting pluripotency markers for iPSC (Oct4 and E-cadherin) and neuronal markers for NPC (Sox2 and Nestin) (Fig. 1A). To identify DNA regions that become accessible during neuronal differentiation, we performed ATAC-seq on both iPSCs and NPCs. We made biological replicates of the samples for ATAC-seq, confirmed their reproducibility, and proceeded to integrate them for further analysis (Fig. 1B). At first, we defined ATAC peaks in each cell type using MACS2 with specific options (Zhang et al., 2008). By using ChIPseeker with default settings (Yu et al., 2015), we observed that most of the ATAC peaks were found in the promoters and distal intergenic regions (Fig. 1C). Next, we categorized ATAC peaks into three groups depending on the existence of peaks in iPSC and NPC: common to both samples, iPSC-specific, or NPC-specific. We executed GREAT analysis for each ATAC peak groups with whole genome background (McLean et al., 2010) and checked the genomic positions of each group; 48.1% of Group C were observed within 5 kb from the transcription start sites (TSS), but in contrast, only 7.9% of Group I and 16.3% of Group N were located near the TSS regions (Fig. 1D). Interestingly, about 93% of Group I and about 83% of Group N were located at a distance over 5 kb from the TSS, suggesting that cell type-specific chromatin accessible regions may contain distal-acting regulatory elements (Fig. 1D). For ATAC peak grouping, we first used Deseq2 for differential peak calling. However, we found that the results could not reflect closed chromatin states (data not shown). So, we utilized the strategy of re-calling peaks from a previous study (Li et al., 2017). When we examined the RPKM (Reads Per Kilobase per Million mapped reads) of iPSC and NPC states in iPSC-specific peaks, we found that there was a clear separation point (1.75) between them. Thus, we set the peak intensity of 1.75 as our standard for the background signal (Fig. 1E). The NPC-specific peaks also showed the same crossing point as the iPSC-specific peaks. Based on this standard, we refined the following three groups: commonly accessible regions (ATAC peak intensity > 1.75 in both iPSC and NPC: Group C), iPSC-specific accessible regions (ATAC peak intensity > 1.75 in iPSC and < in NPC: Group I), and NPC-specific accessible regions (ATAC peak intensity > 1.75 in NPC and < in iPSC: Group N).
We next sought to determine which genes were linked to our defined groups. We performed hierarchical clustering analysis for the three ATAC peak groups and then examined which genes existed within +/- 5 kb from the ATAC peak regions (Fig. 2A). The genes in Group C were mainly house-keeping genes such as
To further examine the features of each ATAC peak group, we decided to look into any histone modification changes corresponding to DNA accessibility during differentiation into NPC. To do this, we utilized three histone ChIP-seq data sets: H3K4me3 (active histone marker), H3K27me3 (repressive histone marker), and H3K27ac (active histone and enhancer marker). Group I showed a high level of H3K27me3 and low levels of H3K4me3 and H3K27ac compared with Group C (Fig. 3A). By contrast, Group N showed a lower level of H3K27me3 and elevated levels of H3K4me3 and H3K27ac, suggesting that changes in chromatin accessibility were positively linked to levels of H3K4me3 and H3K27ac and negatively to H3K27me3. For further analysis, we sub-categorized the groups into either promoter or enhancer. The promoter group was determined as the regions located within +/- 2 kb from the TSS, and the enhancer group was defined from the H3K27ac peak data with ENCODE candidate Regulatory Elements (cRE) data. We calculated the average levels of H3K4me3, H3K27me3, and H3K27ac respectively, and then profiled them in the regions of +/- 500 bp from the ATAC peak summit (Fig. 3B). During iPSC to NPC differentiation, the level of H3K4me3 in the promoter was decreased and the level of H3K27me3 was increased in the iPSC-specific peak (Group I), as expected. Surprisingly, no change was observed for H3K4me3 and H3K27me3 in Group N, even after differentiation into NPC. However, in the enhancer, the level of H3K27ac was relatively low in the iPSC-specific peak (Group I) and increased in the NPC-specific peak (Group N), corresponding with the differentiation process. Taken together, we concluded that the changes to chromatin accessibility have more effect on the activity of the enhancers rather than promoters.
It was reported that the cellular differentiation processes are regulated by specific TFs that initiate chromatin structure changes via cell type specific enhancers (Lupien et al., 2008). We performed TF motif enrichment analysis targeting our cell type specific ATAC peaks in the enhancer regions by HOMER analysis (Heinz et al., 2010). At first, we sought to find TFs that satisfy our criteria of
To characterize these five groups of TFs, we sought to determine which genes were connected to them by utilizing Hi-C data to analyze the physical chromatin interactions. We made an intra-chromosomal interaction matrix with 5 kb resolution, linked enhancers to target genes, and performed gene ontology analysis (Table 1). As a result, we found that Group 1 (HOX and NEUROD1 diminished) included genes related to organ morphogenesis, cellular differentiation, and the Hippo signaling pathway. Group 2 (HOX dominant) included genes linked to neural tube development and the Wnt signaling pathway, and Group 3 (NEUROD1 with PBX1 and PKNOX1) were linked to the canonical Wnt signaling pathway and cell cycle related genes. Group 4 (NEUROD1 and ONECUT1) included genes related to neural crest cell migration, neural tube closure, and axon guidance. Lastly, Group 5 (NEUROD1 dominant) were linked to genes that promote cell proliferation, except keratinocytes, and promote the Notch signaling pathway (Table 1). Interestingly, Group 2 and Group 4 were related to neuronal development terms; however, overall the genes from Groups 1 to 5 were required for intrinsic NPC development from iPSCs.
Profiling chromatin accessibility changes during differentiation provides an insight into the regulatory mechanism of lineage-specific genes. These chromatin accessibility changes are mediated by a certain combination of TFs, so-called pioneer factors (Frank et al., 2015; Schaffner, 2015; Scott-Browne et al., 2016). The Yamanaka factors, which consist of OCT4, SOX2, NANOG, and MYC, are the most well-recognized pioneer factors that convert fibroblasts into stem cells (Takahashi and Yamanaka, 2006). Pioneer factors involved in several lineages have been discovered as well, including GATA4, MEF2C, and TBX5 which are able to induce direct reprogramming of fibroblasts into cardiomyocyte-like cells (Qian et al., 2013). Pioneer factors related to neuronal differentiation are also termed proneural factors, which mainly have bHLH structures (Baker and Brown, 2018; Bertrand et al., 2002). It is known that proneural factors, such as the BAM factors, NEUROD1, and NEUROG2, are able to bind nucleosomal chromatin and induce chromatin opening in fibroblasts during differentiation into neuronal cells (Soufi et al., 2012; Wapinski et al., 2013). The degree to which such TFs are able to bind to chromatin is reflected in chromatin accessibility, which then extrapolates to provide researchers with snapshots of dynamic regulatory capacity of the genome.
We used ATAC-seq, a widely adopted method given its robustness and relative ease (Klemm et al., 2019) to compare chromatin accessbility between iPSCs and its differentiated counterpart NPCs. Through analysis of chromatin status and histone modification data (H3K4me3, H3K27me3, H3K27ac), we showed that there is higher correlation between chromatin accessibility and activation of enhancers rather than with promoters. We then observed that the NEUROD1 binding sites were highly enriched in NPC-specific accessible regions in enhancers. In addition to NEUROD1, other TF motifs were identified such as the HOX family, PBX3, PHOX2, and ONECUT1, which are all involved in neuronal development. Generally, the HOX genes have conserved roles in body segmentation with dynamic expression patterns in response to cellular environment (Krumlauf, 1994; Philippidou and Dasen, 2013). The HOX 1 to 5 genes are highly expressed in the hindbrain, while HOX 4 to 11 are predominantly expressed in the spinal cord (Philippidou and Dasen, 2013). PHOX2 is known to play a role in neuronal development (Brunet and Pattyn, 2002; Dubreuil et al., 2002). PBX and PKNOX1 controls the localization each other, and PBX3 is highly expressed in the nervous system development (Ferretti et al., 2006; Rhee et al., 2004). In addition to the homeobox genes, the zinc finger proteins also play crucial roles in neuronal development (Lamar et al., 2001; Qin et al., 2009; Shimizu et al., 2010). It is known that NR4A1, a member of the nuclear receptor family of TFs, has a role in regulating neuronal development in the central nervous system, and together with ONECUT1/2, is highly expressed in the developing mouse retina (Akiyama et al., 2008; Honkaniemi and Sharp, 1999; Wu et al., 2012). Furthermore, PAX3 and PAX6 which belong to the paired box gene family, have distinctive roles in early neuronal differentiation and the developing spinal cord (Bel-Vialar et al., 2007; Terzic and Saraga-Babic, 1999). Finally, based on physical chromatin interactions with the above TFs, we were able to identify the first line of genes involved in neuronal lineage specification. Our findings carry significance—while various studies involving iPSC-derived neuronal disease models have suggested causal relationship between non-coding disease variants and defect of regulatory elements, giving rise to dysregulated TF binding to neuronal cell-type specific accessible regions in neuronal diseases such as Alzheimer’s disease and schizophrenia (Forrest et al., 2017; Schwartzentruber et al., 2018; Zhang et al., 2020)—there still remains a lack of information on the downstream genes that are affected by defective regulatory elements. Cell fate specification consists of multiple steps, necessitating precise control of gene regulation (Frank et al., 2015) and we have provided a framework that is able to accurately identify genes involved in early neuronal differentiation from iPSCs to NPCs through the integrated analysis of chromatin accessibility, histone modification and chromatin spatial organization.
While the role of NEUROD1 in inducing changes to chromatin accessibility during direct reprogramming process (Matsuda et al., 2019) has been recognized, other factors that cooperate with NEUROD1 are still unknown, particularly during the intrinsic developmental process from iPSC into NPC. To the best of our knowedege, our study is the first report to fully lineate the following: 1) localization of chromatin sites that are first opened, 2) corresponding TFs that bind to these sites, and 3) initial genes under regulation of the TFs during neuronal differentiation. Although there remains a need to characterize expression dynamics of NeuroD1 during iPSC to NPC differentiation, this study sheds light on the understanding of normal early neuronal development and holds potential to be used as a resource in comparing with neuronal disease model using iPSC.
This work was supported by Samsung Science and Technology Foundation Project (SSTF-BA1601-13).
W.Y.C. and Y.J.K. conceived and designed the study. W.Y.C. and A.J.L. produced and processed next generation sequencing data and W.Y.C. analyzed the data. A.N.C. performed cellular work for iPSC maintaining and differentiation. I.J. and S.W.C. contributed advice in experiment and data analysis. W.Y.C., J.H.H., and L.K.K. wrote the manuscript. L.K.K. and Y.J.K. supervised the project.
The authors have no potential conflicts of interest to disclose.
Target genes linked with enhancers that are associated with NPC-specific ATAC peaks
Enhancer-TF group | Genes | ||
---|---|---|---|
Group 1 | NEUROD1 and HOX diminished | ||
Organ morphogenesis | 1.80E-02 | ||
Cell differentiation | 1.50E-02 | ||
Hippo signaling pathway | 1.90E-03 | ||
Group 2 | HOX dominant | ||
Neural tube development | 1.60E-03 | ||
Wnt signaling | 2.20E-02 | ||
Group 3 | NEUROD1 with PBX3 and PKNOX1 | ||
Canonical Wnt signaling pathway | 2.90E-02 | ||
Cell cycle | 3.30E-02 | ||
Group 4 | NEUROD1 with ONECUT1 | ||
Neural crest cell migration | 5.80E-04 | ||
Neural tube closure | 9.10E-03 | ||
Axon guidance | 3.30E-02 | ||
Group 5 | NEUROD1 dominant | ||
Positive regulation of cell proliferation | 2.00E-02 | ||
Negative regulation of keratinocyte proliferation | 3.40E-02 | ||
Positive regulation of Notch signaling pathway | 2.70E-02 |
We linked enhancers and genes by Hi-C interaction analysis with 5 kb resolution. Interactions with a ≥ 2-fold change above the background signal and a > 5 of normalized interaction counts were included in the analysis. P values were calculated using the Benjamini–Hochberg method in DAVID.
Mol. Cells 2020; 43(12): 1011-1022
Published online December 31, 2020 https://doi.org/10.14348/molcells.2020.0207
Copyright © The Korean Society for Molecular and Cellular Biology.
Won-Young Choi1 , Ji-Hyun Hwang1
, Ann-Na Cho2
, Andrew J. Lee3
, Inkyung Jung3
, Seung-Woo Cho2
, Lark Kyun Kim4,*
, and Young-Joon Kim5,*
1Interdisciplinary Program of Integrated OMICS for Biomedical Science, The Graduate School, Yonsei University, Seoul 03722, Korea, 2Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea, 3Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea,4Severance Biomedical Science Institute and BK21 PLUS Project for Medical Sciences, Gangnam Severance Hospital, Yonsei University College of Medicine, Seoul 06273, Korea, 5Department of Biochemistry, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
Correspondence to:LKKIM@yuhs.ac (LKK); yjkim@yonsei.ac.kr (YJK)
Cell type specification is a delicate biological event in which every step is under tight regulation. From a molecular point of view, cell fate commitment begins with chromatin alteration, which kickstarts lineage-determining factors to initiate a series of genes required for cell specification. Several important neuronal differentiation factors have been identified from ectopic over-expression studies. However, there is scarce information on which DNA regions are modified during induced pluripotent stem cell (iPSC) to neuronal progenitor cell (NPC) differentiation, the cis regulatory factors that attach to these accessible regions, or the genes that are initially expressed. In this study, we identified the DNA accessible regions of iPSCs and NPCs via the Assay for Transposase-Accessible Chromatin sequencing (ATACseq). We identified which chromatin regions were modified after neuronal differentiation and found that the enhancer regions had more active histone modification changes than the promoters. Through motif enrichment analysis, we found that NEUROD1 controls iPSC differentiation to NPC by binding to the accessible regions of enhancers in cooperation with other factors such as the Hox proteins. Finally, by using Hi-C data, we categorized the genes that directly interacted with the enhancers under the control of NEUROD1 during iPSC to NPC differentiation.
Keywords: chromatin accessibility, Hi-C, induced pluripotent stem cell, NEUROD1, neuronal progenitor cel
Human induced pluripotent stem cells (hiPSCs) have been heavily utilized as experimental platforms by various research areas studying lineage differentiation and modelling of human developmental process and diseases (Cotovio and Fernandes, 2020; Rowe and Daley, 2019; Shum et al., 2015). The need for iPSC is especially high in disease modelling given the difficulties in isolating cells from patients and obtaining the optimal number of cells for experiments. Among different cell lineages from iPSC, neuronal lineage which can be matured to functional neuron is of special clinical interest, and iPSC-derived neuronal differentiation models are already in use in analysis of neurodegenerative disease such as Alzheimer’s disease, Parkinson’s disease, Huntington’s disease and motor neuron diseases (Chang et al., 2020; Wu et al., 2019). Although iPSC-derived NPC model has been well studied in terms of defining cellular phenotype and neuronal function according to gene expression, the mechanism of gene regulation is still elusive (Kang et al., 2017; Zhang et al., 2013).
Gene expression is tightly regulated during cell fate commitment (Rue and Martinez Arias, 2015), and one extensively studied layer of regulation is epigenetic modification such as histone and DNA methylation, which regulates the nuclear environments to control gene transcription (Cedar and Bergman, 2009). Other than direct modification, chromatin accessibility to nuclear macromolecules such as transcription factors (TFs) itself is another facet in gene regulation (Klemm et al., 2019). Highly accessible regions of chromatin are considered to be open and promote gene expression, whereas inaccessible regions are closed and inhibit expression (Cedar and Bergman, 2009). It is therefore important to localize the regions with cell-type specific accessibilities as they are inferred to contain genes that determine cell fate (Scott-Browne et al., 2016). Enhancers, for example, have been identified as regions that hold most chromatin accessibiity changes during neuronal development (Frank et al., 2015).
The enhancer is one of the
In this study, we identified cell type-specific accessible chromatin regions of iPSCs and NPCs by Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq) and also noted the DNA regions that changed during differentiation and found that the accessibility of the enhancer, but not the promoter, was related to active histone modification changes. Through motif enrichment analysis, we found that NEUROD1, Hox proteins, and ONECUT1 may control neuronal differentiation by binding to the accessible regions of the enhancer. Furthermore, we identified the initial genes that are controlled by NEUROD1 during iPSC to NPC differentiation using Hi-C sequencing. Finally, our results can be used as additional resource to aid studies of regulatory mechanism in neuronal development-related genes during iPSC to NPC differentiation by chromatin accessibility changes and TF bindings.
The use of human iPSCs was approved by the Institutional Review Board (IRB) of Yonsei University (permit No. 7001988-201802-BR-119-01E). In this study, we used pre-constructed iPSC cell lines (Cat. No. C-004-5C; Gibco, USA). We cultured iPSC cell lines on Matrigel (BD Biosciences, USA) with Essential 8 medium (Invitrogen, USA) coating (Chen et al., 2011; 2017; Higuchi et al., 2015). To differentiate iPSC to NPC, an embryoid body (EB) was generated through culturing of human iPSCs for 5-6 days on non-adherent petri dishes in Essential 8 (Invitrogen) with additional supplement of 5 μM dorsomorphin (DM; Sigma-Aldrich, USA) and 5 μM SB431542 (Sigma-Aldrich). EBs were then attached to the new culture dish coated with Matrigel (BD Biosciences) with neural induction medium comprised of DMEM/F12 medium (Invitrogen), 1× N2 supplement (Invitrogen), 1× nonessential amino acids (Invitrogen) for 6 days. When neural rosette formation appears in the center of the EBs, NPCs were collected by using Pasteur pipet (Jin et al., 2019; Seo et al., 2015; Shin et al., 2017).
ATAC-seq samples were prepared by described method (Buenrostro et al., 2015). We washed 50,000 cells with cold phosphate-buffered saline (PBS) and resuspended cells by cell lysis buffer (10 mM Tris-HCl [pH 7.4], 10 mM NaCl, 3 mM MgCl2, 0.1% IGEPAL CA-630) and centrifuged cells (10 min, 2,000
Prior to mapping, quality control steps were executed. We assessed reads by ‘FastQC’ for read quality control and adapter removing process was done by ‘Cutadapt’ (cutadapt -a CTGTCTCTTATACACATCT -q 10 --minimum-length 36 -o sample_1_trimmed.fastq -p sample_2_trimmed.fastq sample_1.fastq sample_2.fastq). Processed reads were aligned by ‘Bowtie2’ (bowtie2 --very-sensitive -x
We executed H3K27ac ChIP-seq for profiling enhancers in iPSC and NPC cells. Cells were washed with Dulbecco’s PBS (DPBS) and crosslinked by crosslinking buffer (100 mM NaCl, 0.1 mM EDTA, 5 mM HEPES [pH 8.0], 1% formaldehyde) for 10 min at room temperature. The crosslinking process was halted with 125 mM glycine in room temperature for 5 min with rotation and washed twice with cold DPBS. Crosslinked samples were resuspended in SDS lysis buffer (1% SDS, 50 mM Tris-HCl [pH 8.0], 10 mM EDTA) with protease inhibitor (Roche, Switzerland). After proceeding sonication (S220; Covaris, USA), the sonicated chromatin samples were incubated with H3K27ac antibody (39133; Active Motif, USA) and magnetic bead (10001D; Thermo Fisher Scientific, USA) for 4 h in 4°C with rotation. At the same time, input chromatin was stored from sonicated samples to be used as input control. After incubation, the bead complex was washed by RIPA-α (140 mM NaCl, 1 mM EDTA pH 8, 0.5 mM EGTA pH 8, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, 10 mM Tris pH 8) with 4 times. The chromatin immunoprecipitated complex was treated with RNaseA (19101; Qiagen) and executed overnight reverse-crosslinking at 68°C. The chromatin immunoprecipitated DNA was extracted using AMPure XP beads (A63881; Beckman Coulter), and libraries were prepared using NEB Next Ultra II DNA library Prep Kit (E7645; NEB, USA) following the manufacturer’s instruction. The ChIP-seq libraries were sequenced in 150 bp paired-end mode using Illumina NovaSeq 6000 platform.
We utilized H3K4me3 and H3K27me3 ChIP-seq data for iPSC and NPC in Gene Expression Omnibus (GEO) (GSE156723). For read alignment, we used BWA (0.7.7) to map reads to the human genome sequence (hg19). We further removed duplicated reads using Picard MarkDuplicate (1.141). We further inspected the quality of the ChIP-seq experiment by measuring FRiP (fraction of reads in peaks) using ChIPQC (3.3). For peak calling, we used MACS2 (2.1.0) to call the significantly enriched ChIP region compare to the input IgG control “q -0.01 (FDR < 0.01%)”. For the ATAC peak intensity comparison, we generated the normalized RPKM values by counting the number of reads for 50 bp bins then dividing by the total sequencing depth.
To identify specifically enriched motifs in ATAC peak groups, we used the HOMER algorithm with default options (Heinz et al., 2010). We analyzed the known motifs results derived from published data which are provided by HOMER algorithm.
We utilized RNA-seq data for iPSC and NPC in GEO (GSE156723). Sequenced reads were mapped and aligned to the hg19 reference using Tophat2 (v2.1.0) and STAR (v2.6.0a) with the support of GRCh37 reference. From the mapped reads, we executed calculation of TPM (transcripts per million) using RSEM (v1.2.31) at the gene level. For data visualization browser, bamCoverage (deepTools 3.5.0) was used with default options.
Hi-C data processing is executed by described method in Yang et al. (2018). Raw reads were aligned to hg19 human reference genome. To focus on cis-chromosomal interactions, invalid Hi-C reads were removed by criteria which defined in previous studies; inter-chromosomal interactions, putative self-ligations, and non-restriction site specific ligations (Dixon et al., 2015; Schmitt et al., 2016). Valid cis-interactions were collected to build raw interaction frequency matrix with 5 kb resolution. A negative binomial model-based normalization pipeline removed biases of the interaction frequency matrix. Finally, distance dependent background signal was used to normalization for emphasizing biologically significant long-range chromatin interactions.
The NGS sequence data generated in this paper have been deposited in the GEO database, www.ncbi.nlm.nih.gov/geo (accession No. GSE158382).
Before producing ATAC-seq data for iPSC to NPC, we confirmed our differentiation system by detecting pluripotency markers for iPSC (Oct4 and E-cadherin) and neuronal markers for NPC (Sox2 and Nestin) (Fig. 1A). To identify DNA regions that become accessible during neuronal differentiation, we performed ATAC-seq on both iPSCs and NPCs. We made biological replicates of the samples for ATAC-seq, confirmed their reproducibility, and proceeded to integrate them for further analysis (Fig. 1B). At first, we defined ATAC peaks in each cell type using MACS2 with specific options (Zhang et al., 2008). By using ChIPseeker with default settings (Yu et al., 2015), we observed that most of the ATAC peaks were found in the promoters and distal intergenic regions (Fig. 1C). Next, we categorized ATAC peaks into three groups depending on the existence of peaks in iPSC and NPC: common to both samples, iPSC-specific, or NPC-specific. We executed GREAT analysis for each ATAC peak groups with whole genome background (McLean et al., 2010) and checked the genomic positions of each group; 48.1% of Group C were observed within 5 kb from the transcription start sites (TSS), but in contrast, only 7.9% of Group I and 16.3% of Group N were located near the TSS regions (Fig. 1D). Interestingly, about 93% of Group I and about 83% of Group N were located at a distance over 5 kb from the TSS, suggesting that cell type-specific chromatin accessible regions may contain distal-acting regulatory elements (Fig. 1D). For ATAC peak grouping, we first used Deseq2 for differential peak calling. However, we found that the results could not reflect closed chromatin states (data not shown). So, we utilized the strategy of re-calling peaks from a previous study (Li et al., 2017). When we examined the RPKM (Reads Per Kilobase per Million mapped reads) of iPSC and NPC states in iPSC-specific peaks, we found that there was a clear separation point (1.75) between them. Thus, we set the peak intensity of 1.75 as our standard for the background signal (Fig. 1E). The NPC-specific peaks also showed the same crossing point as the iPSC-specific peaks. Based on this standard, we refined the following three groups: commonly accessible regions (ATAC peak intensity > 1.75 in both iPSC and NPC: Group C), iPSC-specific accessible regions (ATAC peak intensity > 1.75 in iPSC and < in NPC: Group I), and NPC-specific accessible regions (ATAC peak intensity > 1.75 in NPC and < in iPSC: Group N).
We next sought to determine which genes were linked to our defined groups. We performed hierarchical clustering analysis for the three ATAC peak groups and then examined which genes existed within +/- 5 kb from the ATAC peak regions (Fig. 2A). The genes in Group C were mainly house-keeping genes such as
To further examine the features of each ATAC peak group, we decided to look into any histone modification changes corresponding to DNA accessibility during differentiation into NPC. To do this, we utilized three histone ChIP-seq data sets: H3K4me3 (active histone marker), H3K27me3 (repressive histone marker), and H3K27ac (active histone and enhancer marker). Group I showed a high level of H3K27me3 and low levels of H3K4me3 and H3K27ac compared with Group C (Fig. 3A). By contrast, Group N showed a lower level of H3K27me3 and elevated levels of H3K4me3 and H3K27ac, suggesting that changes in chromatin accessibility were positively linked to levels of H3K4me3 and H3K27ac and negatively to H3K27me3. For further analysis, we sub-categorized the groups into either promoter or enhancer. The promoter group was determined as the regions located within +/- 2 kb from the TSS, and the enhancer group was defined from the H3K27ac peak data with ENCODE candidate Regulatory Elements (cRE) data. We calculated the average levels of H3K4me3, H3K27me3, and H3K27ac respectively, and then profiled them in the regions of +/- 500 bp from the ATAC peak summit (Fig. 3B). During iPSC to NPC differentiation, the level of H3K4me3 in the promoter was decreased and the level of H3K27me3 was increased in the iPSC-specific peak (Group I), as expected. Surprisingly, no change was observed for H3K4me3 and H3K27me3 in Group N, even after differentiation into NPC. However, in the enhancer, the level of H3K27ac was relatively low in the iPSC-specific peak (Group I) and increased in the NPC-specific peak (Group N), corresponding with the differentiation process. Taken together, we concluded that the changes to chromatin accessibility have more effect on the activity of the enhancers rather than promoters.
It was reported that the cellular differentiation processes are regulated by specific TFs that initiate chromatin structure changes via cell type specific enhancers (Lupien et al., 2008). We performed TF motif enrichment analysis targeting our cell type specific ATAC peaks in the enhancer regions by HOMER analysis (Heinz et al., 2010). At first, we sought to find TFs that satisfy our criteria of
To characterize these five groups of TFs, we sought to determine which genes were connected to them by utilizing Hi-C data to analyze the physical chromatin interactions. We made an intra-chromosomal interaction matrix with 5 kb resolution, linked enhancers to target genes, and performed gene ontology analysis (Table 1). As a result, we found that Group 1 (HOX and NEUROD1 diminished) included genes related to organ morphogenesis, cellular differentiation, and the Hippo signaling pathway. Group 2 (HOX dominant) included genes linked to neural tube development and the Wnt signaling pathway, and Group 3 (NEUROD1 with PBX1 and PKNOX1) were linked to the canonical Wnt signaling pathway and cell cycle related genes. Group 4 (NEUROD1 and ONECUT1) included genes related to neural crest cell migration, neural tube closure, and axon guidance. Lastly, Group 5 (NEUROD1 dominant) were linked to genes that promote cell proliferation, except keratinocytes, and promote the Notch signaling pathway (Table 1). Interestingly, Group 2 and Group 4 were related to neuronal development terms; however, overall the genes from Groups 1 to 5 were required for intrinsic NPC development from iPSCs.
Profiling chromatin accessibility changes during differentiation provides an insight into the regulatory mechanism of lineage-specific genes. These chromatin accessibility changes are mediated by a certain combination of TFs, so-called pioneer factors (Frank et al., 2015; Schaffner, 2015; Scott-Browne et al., 2016). The Yamanaka factors, which consist of OCT4, SOX2, NANOG, and MYC, are the most well-recognized pioneer factors that convert fibroblasts into stem cells (Takahashi and Yamanaka, 2006). Pioneer factors involved in several lineages have been discovered as well, including GATA4, MEF2C, and TBX5 which are able to induce direct reprogramming of fibroblasts into cardiomyocyte-like cells (Qian et al., 2013). Pioneer factors related to neuronal differentiation are also termed proneural factors, which mainly have bHLH structures (Baker and Brown, 2018; Bertrand et al., 2002). It is known that proneural factors, such as the BAM factors, NEUROD1, and NEUROG2, are able to bind nucleosomal chromatin and induce chromatin opening in fibroblasts during differentiation into neuronal cells (Soufi et al., 2012; Wapinski et al., 2013). The degree to which such TFs are able to bind to chromatin is reflected in chromatin accessibility, which then extrapolates to provide researchers with snapshots of dynamic regulatory capacity of the genome.
We used ATAC-seq, a widely adopted method given its robustness and relative ease (Klemm et al., 2019) to compare chromatin accessbility between iPSCs and its differentiated counterpart NPCs. Through analysis of chromatin status and histone modification data (H3K4me3, H3K27me3, H3K27ac), we showed that there is higher correlation between chromatin accessibility and activation of enhancers rather than with promoters. We then observed that the NEUROD1 binding sites were highly enriched in NPC-specific accessible regions in enhancers. In addition to NEUROD1, other TF motifs were identified such as the HOX family, PBX3, PHOX2, and ONECUT1, which are all involved in neuronal development. Generally, the HOX genes have conserved roles in body segmentation with dynamic expression patterns in response to cellular environment (Krumlauf, 1994; Philippidou and Dasen, 2013). The HOX 1 to 5 genes are highly expressed in the hindbrain, while HOX 4 to 11 are predominantly expressed in the spinal cord (Philippidou and Dasen, 2013). PHOX2 is known to play a role in neuronal development (Brunet and Pattyn, 2002; Dubreuil et al., 2002). PBX and PKNOX1 controls the localization each other, and PBX3 is highly expressed in the nervous system development (Ferretti et al., 2006; Rhee et al., 2004). In addition to the homeobox genes, the zinc finger proteins also play crucial roles in neuronal development (Lamar et al., 2001; Qin et al., 2009; Shimizu et al., 2010). It is known that NR4A1, a member of the nuclear receptor family of TFs, has a role in regulating neuronal development in the central nervous system, and together with ONECUT1/2, is highly expressed in the developing mouse retina (Akiyama et al., 2008; Honkaniemi and Sharp, 1999; Wu et al., 2012). Furthermore, PAX3 and PAX6 which belong to the paired box gene family, have distinctive roles in early neuronal differentiation and the developing spinal cord (Bel-Vialar et al., 2007; Terzic and Saraga-Babic, 1999). Finally, based on physical chromatin interactions with the above TFs, we were able to identify the first line of genes involved in neuronal lineage specification. Our findings carry significance—while various studies involving iPSC-derived neuronal disease models have suggested causal relationship between non-coding disease variants and defect of regulatory elements, giving rise to dysregulated TF binding to neuronal cell-type specific accessible regions in neuronal diseases such as Alzheimer’s disease and schizophrenia (Forrest et al., 2017; Schwartzentruber et al., 2018; Zhang et al., 2020)—there still remains a lack of information on the downstream genes that are affected by defective regulatory elements. Cell fate specification consists of multiple steps, necessitating precise control of gene regulation (Frank et al., 2015) and we have provided a framework that is able to accurately identify genes involved in early neuronal differentiation from iPSCs to NPCs through the integrated analysis of chromatin accessibility, histone modification and chromatin spatial organization.
While the role of NEUROD1 in inducing changes to chromatin accessibility during direct reprogramming process (Matsuda et al., 2019) has been recognized, other factors that cooperate with NEUROD1 are still unknown, particularly during the intrinsic developmental process from iPSC into NPC. To the best of our knowedege, our study is the first report to fully lineate the following: 1) localization of chromatin sites that are first opened, 2) corresponding TFs that bind to these sites, and 3) initial genes under regulation of the TFs during neuronal differentiation. Although there remains a need to characterize expression dynamics of NeuroD1 during iPSC to NPC differentiation, this study sheds light on the understanding of normal early neuronal development and holds potential to be used as a resource in comparing with neuronal disease model using iPSC.
This work was supported by Samsung Science and Technology Foundation Project (SSTF-BA1601-13).
W.Y.C. and Y.J.K. conceived and designed the study. W.Y.C. and A.J.L. produced and processed next generation sequencing data and W.Y.C. analyzed the data. A.N.C. performed cellular work for iPSC maintaining and differentiation. I.J. and S.W.C. contributed advice in experiment and data analysis. W.Y.C., J.H.H., and L.K.K. wrote the manuscript. L.K.K. and Y.J.K. supervised the project.
The authors have no potential conflicts of interest to disclose.
. Target genes linked with enhancers that are associated with NPC-specific ATAC peaks.
Enhancer-TF group | Genes | ||
---|---|---|---|
Group 1 | NEUROD1 and HOX diminished | ||
Organ morphogenesis | 1.80E-02 | ||
Cell differentiation | 1.50E-02 | ||
Hippo signaling pathway | 1.90E-03 | ||
Group 2 | HOX dominant | ||
Neural tube development | 1.60E-03 | ||
Wnt signaling | 2.20E-02 | ||
Group 3 | NEUROD1 with PBX3 and PKNOX1 | ||
Canonical Wnt signaling pathway | 2.90E-02 | ||
Cell cycle | 3.30E-02 | ||
Group 4 | NEUROD1 with ONECUT1 | ||
Neural crest cell migration | 5.80E-04 | ||
Neural tube closure | 9.10E-03 | ||
Axon guidance | 3.30E-02 | ||
Group 5 | NEUROD1 dominant | ||
Positive regulation of cell proliferation | 2.00E-02 | ||
Negative regulation of keratinocyte proliferation | 3.40E-02 | ||
Positive regulation of Notch signaling pathway | 2.70E-02 |
We linked enhancers and genes by Hi-C interaction analysis with 5 kb resolution. Interactions with a ≥ 2-fold change above the background signal and a > 5 of normalized interaction counts were included in the analysis. P values were calculated using the Benjamini–Hochberg method in DAVID..
Hongwoo Lee and Pil Joon Seo
Mol. Cells 2021; 44(12): 883-892 https://doi.org/10.14348/molcells.2021.0014Kyukwang Kim, Junghyun Eom, and Inkyung Jung
Mol. Cells 2019; 42(7): 512-522 https://doi.org/10.14348/molcells.2019.0137Lusha Ji, Rui Xu, Longtao Lu, Jiedao Zhang, Guodong Yang, Jinguang Huang, Changai Wu, and Chengchao Zheng
Mol. Cells 2013; 36(2): 127-137 https://doi.org/10.1007/s10059-013-0092-z