Mol. Cells 2022; 45(8): 522-530
Published online August 5, 2022
https://doi.org/10.14348/molcells.2022.0060
© The Korean Society for Molecular and Cellular Biology
Correspondence to: khs307@pusan.ac.kr
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
Transposable elements (TEs) account for approximately 45% of the human genome. TEs proliferated randomly and integrated into functional genes during hominoid radiation. They appear as right-handed B-DNA double helices and slightly elongated left-handed Z-DNAs. Human endogenous retrovirus (HERV) families are widely distributed across human chromosomes and account for about 8% of the genome. They have a 5′-long terminal repeat (LTR)-gag-pol-env-3′-LTR structure. LTRs contain the U3 enhancer and promoter region, the transcribed R region, and the U5 region, and they can influence host gene expression by acting as regulatory elements. In this review, we describe the alternative promoters derived from LTR elements that overlap Z-DNA by comparing Z-hunt and DeepZ data for human functional genes. We also present evidence showing the regulatory activity of LTR elements containing Z-DNA in GSDML. Taken together, the regulatory activity of LTR elements with Z-DNA helps us understand gene function in relation to various human diseases.
Keywords: gene function, human diseases, human endogenous retrovirus, long terminal repeat elements, Z-DNA
The human genome contains many transposable elements (TEs) introduced by exogenous retroviral infection of the germline cells of our ancestors (Lower et al., 1996). Human endogenous retroviruses (HERVs), which are autonomous retroelements, are the best-known retrotransposons (Havecker et al., 2004). HERV insertions comprise two long terminal repeats (LTRs) flanking an internal region that contains the protein-coding genes (gag, pol, env) necessary for retroviral replication and propagation (Jern and Coffin, 2008). Using the reverse transcriptase (RTase) encoded by the pol gene, HERVs integrated randomly into the human genome (Supplementary Fig. S1A) (Kim et al., 2004; Yi et al., 2004) and then underwent multiple duplication events during hominoid radiation. HERVs comprise up to 8% of the human genome and are dispersed throughout it (Medstrand and Mager, 1998). They cause many mutation events, including deletion of subgenomes, insertions of other transposons (Alu, LINE, ERV, and DNA transposons), and homologous recombination between the 5′-LTR and 3′-LTR of HERV sequences. The two LTRs of a given HERV are more similar to each other than to the LTR sequence of any other HERV; recombination between them can therefore produce a solitary LTR element (Huh et al., 2006; Medstrand and Mager, 1998; Thomas et al., 2018). LTR elements contain a hormone-responsive element, enhancer, and promoter TATA box (located within the U3 region), a polyadenylation signal, AATAAA (located within the R region), and the U5 region (Supplementary Fig. S1B) (Sverdlov, 2000). LTRs can influence host gene expression by acting as regulatory elements (promoters or enhancers) (Durnaoglu et al., 2021; Montension et al., 2018; Ruda et al., 2004).
A left-handed double-helical Z-DNA fragment was identified using X-ray diffraction analysis (Dickerson et al., 1982; Drew et al., 1980; Wang et al., 1981). Purine-pyrimidine alternating sequences, such as poly(dT-dG)-poly(dC-dA), have been shown to adopt the Z-DNA conformation in the presence of high CsCl concentrations and in ethanolic solutions (Zimmer et al., 1982). Stretches of the dC-dG alternating sequence [Z(C-G) element] were found to be moderately repetitive in human, mouse, and salmon genomes (Hamada et al., 1982). The abundant occurrence and evolutionary conservation of the Z(T-G) and Z(C-G) elements could have important biological implications as they could be involved in regulating gene expression and act as hotspots for gene recombination or rearrangement (Hamada et al., 1982). Computer programs (Z-hunt and Z-hunt-II) have been developed to search for genomic sequences in regions most likely to adopt the Z-conformation (Ho et al., 1986; Schroth et al., 1992). The recently developed deep-learning approach, DeepZ, aggregates information from genome-wide maps of epigenetic markers, transcription factors, RNA polymerase-binding sites, and chromosome accessibility maps (Beknazarov et al., 2020).
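The sequence feature described above, alternating purines and pyrimidines such as (C-G)n or (T-G)n, can be illustrated with a very crude scan. The sketch below is not Z-hunt or DeepZ, and it does no thermodynamic or epigenomic scoring; it only reports maximal runs in which purines and pyrimidines strictly alternate. The function names and the `min_len` threshold are assumptions chosen for illustration.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def _alternates(a, b):
    """True if one base is a purine and the other a pyrimidine."""
    return (a in PURINES and b in PYRIMIDINES) or (
        a in PYRIMIDINES and b in PURINES
    )


def alternating_runs(seq, min_len=8):
    """Return (start, end, subsequence) for each maximal run in which
    purines and pyrimidines strictly alternate.

    Coordinates are 0-based and `end` is exclusive. `min_len` is an
    arbitrary illustrative cutoff, not a published threshold.
    """
    seq = seq.upper()
    runs = []
    start = 0
    for i in range(1, len(seq) + 1):
        # A run ends at the end of the sequence, or where two neighbouring
        # bases fall into the same class (or one is not A/C/G/T).
        if i == len(seq) or not _alternates(seq[i - 1], seq[i]):
            if i - start >= min_len:
                runs.append((start, i, seq[start:i]))
            start = i
    return runs


# Made-up test sequence: a (C-G)5 stretch embedded in A and T homopolymers.
example = "AAAA" + "CG" * 5 + "TTTT"
print(alternating_runs(example))
```

Because only the purine/pyrimidine class is checked, the flanking A and T that happen to continue the alternation are included in the reported run. A real predictor weighs dinucleotide identity and supercoiling energetics, which this sketch ignores.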
Z-DNA forms in actively transcribed regions of the genome, as confirmed using ChIP-Seq, indicating that Z-DNA formation depends on chromatin structure as well as sequence composition and is associated with active transcription in human cells (Shin et al., 2016). Potential Z-DNA-forming sequences (ZFSs) are abundant near the transcriptional start sites of genes (Li et al., 2009; Schroth et al., 1992). This suggests that Z-DNA plays a biological role in transcriptional regulation and that RNA polymerase II accumulates local negative supercoiling, creating a suitable environment for Z-DNA formation (Herbert and Rich, 2001; Liu and Wang, 1987). For instance, the highly conserved negative regulatory element (NRE) at the 5′-UTR of the human
Retrotransposon activity is linked with Z-DNA-forming sites that overlap with recombination hotspots (Blaho and Wells, 1989; Wahls et al., 1990). A large proportion of ZFSs is located in promoter regions and has a high potential to form Z-DNA. The Z-DNA-forming sites identified using ChIP-Seq are associated with actively transcribed regions (Shin et al., 2016). ZFSs are also abundant in transposable elements such as Alu (Herbert, 2019). Alternative splicing and Z-DNA formation occur in genes with Alu repeats and dsRNA editing of transcripts. Homologous recombination between the 5′-LTR and 3′-LTR of HERVs results in excision of structural genes (
Promoters regulate the transcription of exons located in downstream positions. Over half of human genes contain more than one promoter, which are collectively described as alternative promoters. Alternative promoters provide transcript diversity and confer dimensional complexity to cells (Landry et al., 2003). They also have different tissue specificities, developmental activities, and expression levels (Medstrand et al., 2001; Schon et al., 2009). Alternative promoters contribute to expression diversity as they create mRNA isoforms by expanding the choice of transcription initiation sites in a gene (Jacox et al., 2010).
HERV LTR elements have a potential evolutionary role in enhancing the coding capacity and regulatory versatility of the genome without compromising its integrity (Sorek, 2007). Moreover, they increase genome plasticity and provide beneficial effects for the species by providing alternative promoters (Akopov et al., 1998; Sverdlov, 2000). Most protein-coding genes in humans are regulated by multiple distinct promoters, suggesting that promoter choice is as important as the level of transcriptional activity. Transcriptome diversity is the key to cellular identity. Although most HERV elements appear inactive, some are still transcribed and translated in specific human tissues (Lower et al., 1996). In our previous study, we examined the LTR10A element located upstream of the original promoter sequence of
HERV and solitary LTR elements can cause several human diseases such as azoospermia, multiple sclerosis, schizophrenia, diabetes, and cancer (Conrad et al., 1997; Kamp et al., 2000; Karlsson et al., 2001; Kim et al., 1999a; 1999b; Li et al., 2020; Patzke et al., 2002; Perron et al., 1997; Xiao and Xu, 2021). Apolipoprotein C1 (
This work was supported by a two-year research grant from Pusan National University.
D.H.L., W.H.B., and H.H. conceived the research and performed Z-DNA analyses. E.G.P., Y.J.L., and W.R.K. provided comments. H.S.K. wrote and revised the manuscript.
The authors have no potential conflicts of interest to disclose.
Functional genes containing alternative promoters derived from LTR elements that overlap Z-DNA prediction sites
| Genes | NCBI Gene ID | Loci | TE types |
|---|---|---|---|
|  | 57134 | 1p36.11 | MER52A-ERV1 |
|  | 7634 | 3q13.31 | LTR12C-ERV1 |
|  | 202299 | 5q15 | LTR12C-ERV1 |
|  | 64288 | 6p22.1 | HERV18 int-ERVL |
|  | 223075 | 7p14.3 | LTR12D-ERV1 |
|  | 401337 | 7p12.1 | MLT2B3-ERVL |
|  | 10086 | 8q24.22 | HERV-H int-ERV1 |
|  | 54742 | 8q24.3 | LTR43-ERV1 |
|  | 3641 | 9p24.1 | LTR22B/HML-5 |
|  | 143501 | 11p15.4 | LTR18B-ERV1 |
|  | 56980 | 11q24.3 | LTR52-ERVL |
|  | 439916 | 12q21.33 | LTR60-ERV1 |
|  | 10232 | 16p13.3 | MER54B-ERVL |
|  | 146183 | 16p12.2 | LTR45B-ERV1 |
|  | 6359 | 17q12 | MER50-ERV1 |
|  | 55106 | 17q12 | MER51A-ERV1 |
|  | 55876 | 17q21.1 | LTR7B-ERV1 |
|  | 341 | 19q13.32 | LTR2/HERV-E |
|  | 29075 | 20p11.23 | MLTF-ERVL |
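The table lists LTR elements that overlap a Z-DNA prediction site. As a minimal sketch of that kind of overlap test, assuming both the LTR annotations and the predicted Z-DNA sites are available as genomic intervals, the code below reports which LTRs intersect at least one prediction. The `Interval` type, the function names, and all coordinates and labels in the example are hypothetical; they are not taken from the review or from Z-hunt or DeepZ output.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def ltrs_with_zdna(ltrs, zdna_sites):
    """Map each LTR name to the names of the predicted Z-DNA sites it overlaps.

    LTRs without any overlapping site are left out of the result.
    """
    hits = {}
    for ltr in ltrs:
        matched = [z.name for z in zdna_sites if overlaps(ltr, z)]
        if matched:
            hits[ltr.name] = matched
    return hits


# Hypothetical coordinates, for illustration only.
ltrs = [
    Interval("chr17", 1000, 1500, "LTR_a"),
    Interval("chr19", 200, 700, "LTR_b"),
]
zdna_sites = [
    Interval("chr17", 1400, 1420, "zhunt_1"),  # inside LTR_a
    Interval("chr17", 1500, 1520, "deepz_1"),  # starts where LTR_a ends
    Interval("chr19", 900, 950, "deepz_2"),    # downstream of LTR_b
]
print(ltrs_with_zdna(ltrs, zdna_sites))
```

The pairwise loop is quadratic and is only meant to make the overlap rule explicit; genome-scale comparisons would normally use sorted intervals or a dedicated interval tool.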
Published online August 31, 2022 https://doi.org/10.14348/molcells.2022.0060
Du Hyeong Lee1,4,5, Woo Hyeon Bae1,4,5, Hongseok Ha2,5, Eun Gyung Park1,4, Yun Ju Lee1,4, Woo Ryung Kim1,4, and Heui-Soo Kim3,4,*
1Department of Integrated Biological Sciences, Pusan National University, Busan 46241, Korea, 2Division of Life Sciences, Korea University, Seoul 02841, Korea, 3Department of Biological Sciences, College of Natural Sciences, Pusan National University, Busan 46231, Korea, 4Institute of Systems Biology, Pusan National University, Busan 46241, Korea, 5These authors contributed equally to this work.
Serpen Durnaoglu, Sun-Kyung Lee, and Joohong Ahnn
Mol. Cells 2021; 44(12): 861-878 https://doi.org/10.14348/molcells.2021.5016