Mol. Cells 2019; 42(4): 333-344
Published online March 26, 2019
https://doi.org/10.14348/molcells.2019.2442
© The Korean Society for Molecular and Cellular Biology
Correspondence to : *schoi@kangwon.ac.kr
Various genetic and environmental factors are known to be associated with chronic obstructive pulmonary disease (COPD). We identified COPD-related differentially expressed genes (DEGs) using 189 samples accompanying either adenocarcinoma (AC) or squamous cell carcinoma (SC), comprising 91 normal and 98 COPD samples. DEGs were obtained from the intersection of two DEG sets separately identified for AC and SC to exclude the influence of different cancer backgrounds co-occurring with COPD. We also measured patient samples named group ‘I’, which were unable to be determined as normal or COPD based on alterations in gene expression. The Gene Ontology (GO) analysis revealed significant alterations in the expression of genes categorized with the ‘cell adhesion’, ‘inflammatory response’, and ‘mitochondrial functions’, i.e., well-known functions related to COPD, in samples from patients with COPD. Multi-omics data were subsequently integrated to decipher the upstream regulatory changes linked to the gene expression alterations in COPD. COPD-associated expression quantitative trait loci (eQTLs) were located at the upstream regulatory regions of 96 DEGs. Additionally, 45 previously identified COPD-related miRNAs were predicted to target 66 of the DEGs. The eQTLs and miRNAs might affect the expression of ‘respiratory electron transport chain’ genes and ‘cell proliferation’ genes, respectively, while both eQTLs and miRNAs might affect the expression of ‘apoptosis’ genes. We think that our present study will contribute to our understanding of the molecular etiology of COPD accompanying lung cancer.
Keywords COPD, differentially expressed genes, lung cancer, regulatory alterations,
Chronic obstructive pulmonary disease (COPD), a complex age-related disease, has two components: chronic bronchitis, characterized by productive cough, and emphysema, demonstrated by destruction of the lung parenchyma (Agustí et al., 2012; Mannino and Buist, 2007; Rabe et al., 2007; Vestbo et al., 2013). COPD patients also suffer from shortness of breath due to chronic airway obstruction and inflammation (Fabbri et al., 2003; Vestbo et al., 2013). COPD is generally diagnosed by chronic and irreparable impairment of lung airflow (Aaron et al., 2007; Barnett, 2005; Calverley et al., 2007; Vestbo et al., 2013). The most prominent cause of COPD is cigarette smoking (CS), although this factor is not the only cause of the disease, and not all smokers have the disease (Buist et al., 2008; Kim et al., 2017; Lundback et al., 2003; Mannino et al., 2003; Stone et al., 1983; Swanney et al., 2008).
Several recent studies have shown that genetic factors contribute to COPD (Anderson and Bozinovski, 2003; Foreman et al., 2012; Jeong et al., 2018; Sakao et al., 2003; Sandford et al., 1998). For instance, loss-of-function mutation of α1-antitrypsin is a well-known genetic risk factor, although it is responsible for only 1–5% of COPD patients (Smith and Harrison, 1997; Stoller and Aboussouan, 2005). In addition, with the recent success of genome-wide association studies (GWAS), the list of COPD-associated genes is expanding rapidly, including FAM13A, HHIP, IREB2, RAB4B, EGLN2, MIA, CYP2A6 (Hardin and Silverman, 2014; Hobbs et al., 2017; Kim and Lee, 2015). Note that many COPD-associated single nucleotide polymorphisms (SNPs) are located in noncoding regions such as intergenic and intronic regions (Artigas et al., 2011; Repapi et al., 2010) rather than in protein-coding regions, which is also the case for other disease-associated SNPs. These noncoding but disease-associated SNPs may contribute to the altered regulation of gene expression, splicing, and epigenetic modifications.
Interestingly, CS can perturb gene expression by affecting various epigenetic markers including DNA methylation and chromatin modification (Belinsky et al., 2002; Kim et al., 2001; Lee and Pausova, 2013). A variety of epigenetic machineries for regulating downstream gene expression are known to be altered in COPD (Lawless et al., 2009; Schamberger et al., 2014). All these studies consistently indicate that both genetic and epigenetic alterations are important in the etiology of COPD. However, the mechanism by which these genes and their genetic mutations mediate the pathogenesis of the disease remains to be elucidated.
Meanwhile, an understanding of the perturbations in gene expression in various diseases will potentially contribute to the identification of molecular targets to develop new therapeutic drugs or prognostic modalities. Transcriptome studies using either microarrays or RNA-Seq approaches have been employed for those purposes in COPD as well (Chen et al., 2008; Kim et al., 2015a; Rangasamy et al., 2009; Steiling et al., 2013; Wang et al., 2008). For instance, according to Wang et al. (2008), extracellular matrix (ECM) and apoptosis genes are upregulated, whereas genes involved in anti-inflammatory functions are down-regulated in a microarray of 48 human lung samples, including normal tissues and samples from patients with various stages of COPD ranging from GOLD (Global Initiative for Chronic Obstructive Lung Diseases) stage 0 to GOLD stage 3. In addition, several other genes involved in inflammatory responses, including cytokines and chemokines, and genes involved in oxidative stress responses are associated with COPD progression (Chen et al., 2008; Kim et al., 2015a; Rangasamy et al., 2009; Steiling et al., 2013).
However, alterations in gene expression are notably heterogeneous among studies with different designs using different cell types (Novak et al., 2002), which is not surprising because 12 different cell types constitute the whole lung tissue (Wang et al., 2008). Moreover, most subjects with COPD were also diagnosed with lung cancers, which was the reason for undergoing lung resection, and thus the DEG results cannot be as easily interpreted as the genes that are altered by COPD alone (Wang et al., 2008). As shown in the study by Spira et al. (2007), gene expressions in normal airway epithelial cells derived from patients with lung cancers differ in a cancer-specific manner.
In the present study, we thus attempt to avoid complexities in estimating DEGs associated with COPD (i.e., COPD-DEGs) driven by different cancer backgrounds co-occurring with COPD or by likely misclassified patient samples named group ‘I’, basically by revisiting the previous study published by Kim et al. (2015a). In addition, we integrate the COPD-DEGs with various multi-omics data, revealing novel insights relevant to COPD, which may ultimately contribute to improving our understanding of the molecular etiology of COPD or to identifying molecular targets for the development of a novel diagnostic or prognostic strategy.
For the present study, we obtained the RNA-Seq data produced from COPD patients with lung cancers from Kim et al. (2015a) (
FPKM values downloaded from the GEO database (
The 189 patient samples in the GSE57148, comprising 91 samples labeled ‘normal’ and 98 samples labeled ‘COPD’, were further divided into COPD coupled with adenocarcinoma (AC-COPD; 58 normal versus 36 COPD) and COPD coupled with squamous cell carcinoma (SC-COPD; 33 normal versus 62 COPD). Detailed clinical information on the sex, age, FEV1/FVC ratio, and cancer types of the samples was retrieved from Kim et al. (2015a) and is provided as
A total of 45 differentially expressed miRNAs (DE-miRNAs) detected in COPD were obtained from Kim and Lee (2017). Information about miRNAs and their target mRNAs was obtained from miRTarBase (ver. 7.0,
A total of 910 previously known COPD-associated genes were obtained from the gene-disease association (GDA) database of DisGeNET (ver. 5.0,
All statistical tests and their related diagrams were implemented with R (ver. 3.5.0,
Total RNA was extracted from two types of mouse COPD models, i.e., five mice from the elastase-induced model (Suki et al., 2017) and three mice from the smoking-induced model (Cavarra et al., 2001); please refer to previous studies to find detailed protocols for constructing mouse COPD model systems (Cavarra et al., 2001; Huh et al., 2011; Kim et al., 2015b; Suki et al., 2017). Total RNA was isolated using the RNeasy Mini Kit (Qiagen, Germany) according to the manufacturer’s instructions, and the concentration was quantified with an Epoch Microplate Spectrophotometer (Biotek Instruments, Inc.). The quality of total RNAs was then tested by measuring the ratio of absorbance at 260 nm and 280 nm. Reverse transcription was conducted using the TOPscriptδ RT DryMIX kit (Enzynomics); for PCR, 1 μl of synthesized cDNA was used with AccuPower PCR pre-MIX (Bioneer). Quantitative real-time PCR (qPCR) was performed to quantify the mRNA expression of candidate genes using the ABI Step One Plus System Instrument (Applied Biosystems). The qPCRs were performed according to the manufacturer’s instructions with TOPrealδ qPCR 2X PreMIX SYBR Green with high ROX (Enzynomics) in a Microamp fast 96-well reaction plate. Levels of relative gene expression were normalized to GAPDH expression. The primer sequences are provided in
We collected RNA-Seq data from a total of 189 samples from the Gene Expression Omnibus (GEO) database (see Methods) to revisit the previous findings (Kim et al., 2015a) regarding COPD-associated gene expression signatures. As described in the original paper, the samples were all derived from either adenocarcinoma (AC) or squamous cell carcinoma (SC) of the lungs, regardless of whether they were labeled normal or COPD. Therefore, the terms ‘normal’ and ‘COPD’ here refer to lung cancer patients without COPD and lung cancer patients with COPD, respectively. We thus decided to classify COPD patients into two groups by the lung cancer types from which they suffered simultaneously, i.e., COPD patients with adenocarcinoma (AC-COPD) and COPD patients with squamous cell carcinoma (SC-COPD)(Fig. 1). Accordingly, the samples from AC and SC patients without COPD were named ‘AC-normal’ and ‘SC-normal’, respectively.
Consistent with the study by Kim et al. (2015), patients with COPD tended to be older, to have more pack-years of CS, and to have a significantly lower FEV1/FVC ratio than individuals without COPD (
We separately identified DEGs for the two COPD groups classified by the two cancer types, AC and SC, using cutoff values of Q < 0.01 and |fold change (FC)| ≥ 1.5 (see Methods). As a result, 150 DEGs were identified between 58 AC-normal and 36 AC-COPD samples (named AC-DEGs), and 58 DEGs were identified between 33 SC-normal and 62 SC-COPD samples (named SC-DEGs).
We conducted heatmap analysis accompanied by unsupervised hierarchical clustering for AC and SC to investigate whether these DEGs demarcated the normal and COPD samples (Figs. 2A and 2D). The classification of samples by the DEGs was not perfect in either cancer types (i.e., some normal samples are clustered in an intermingled way with some COPD samples, and vice versa), leading to the recognition of a third group of patient samples, separate from normal and COPD.
We thus performed
Interestingly, the median FEV1/FVC ratio of group ‘I’ was approximately equivalent to group ‘N’, although the ratio ranged widely from less than 50% (clinical COPD) to greater than 80% (clinically normal) for both AC and SC (Figs. 2C and 2F). Based on this result, group ‘I’ may contain misclassified patient samples, although they had been clinically diagnosed with or without COPD.
After establishing a third group of patients, group ‘I’, we decided to re-identify the three sets of DEGs to determine how the gene expression patterns of the patients in group ‘I’ e differed. We defined NC-DEGs, NI-DEGs, IC-DEGs by comparing gene expression between groups ‘N’ and ‘C’, ‘N’ and ‘I’, and ‘I’ and ‘C’, respectively. Several statistical cutoff values for defining the three sets of DEGs were tested, from which we found that very few NI- and IC-DEGs remained when Q < 0.01 and | FC| ≥ 2 were applied. However, 237 NC-DEGs still remained as DEGs at those stringent cutoffs (
Next, we tried to obtain common COPD-DEGs between AC-COPD and SC-COPD groups by collecting common NC-DEGs identified based on the intersection between the NC-DEG sets of AC and SC. As shown in Fig. 3, 237 common NC-DEGs (the bottom panel), including 131 upregulated and 106 downregulated genes, differentiated COPD from normal samples without ambiguity for both AC (the top left panel) and SC (the top right panel) using a heatmap accompanied by unsupervised hierarchical clustering. Notably, samples from group ‘I’ were excluded from this analysis of both AC and SC.
The intersection between NC-DEGs for AC and SC produced two additional groups of gene sets, ‘AC-specific’ (316 genes) and ‘SC-specific’ (62 genes), as well as the common NC-DEGs (Fig. 4). Moreover, according to the direction of the changes in expression (up- or downregulation), these three categories of NC-DEGs were divided into six categories: ‘Common-up’ (131 genes), ‘Common-down’ (106 genes), ‘AC-specific-up’ (140 genes), ‘AC-specific-down’ (176 genes), ‘SC-specific-up’ (29 genes), and ‘SC-specific-down’ (33 genes)(Fig. 4A).
We performed a GO analysis of each of the 6 categories of DEGs, confirming several previously well-known COPD-related genes in the ‘common’ category. For instance, GO functions such as ‘inflammatory response’, ‘cell adhesion’, and ‘ECM’ were significantly enriched in the ‘Common-up’ category, whereas mitochondria-related functions were significantly enriched in the ‘Common-down’ category (Fig. 4A). Interestingly, all these functional GO terms were completely consistent with the findings from recent reviews aiming to decipher the molecular pathology of COPD (Chen et al., 2008; Wang et al., 2008). Similar GO functional terms appeared in both AC-specific and SC-specific categories, although the numbers of genes associated with specific functional terms differed, suggesting that genes involved in molecular pathways associated with COPD etiology were commonly altered as well. Notably, however, inflammatory and apoptosis genes were prominent functional terms for the AC-specific category, but not the SC-specific category, whereas cell proliferation genes showed the opposite trend (Fig. 4A). We postulate that this difference in functional GO terms exists because the etiology of COPD in patients with AC and SC is not identical due to different genetic and epigenetic conditions associated with each cancer type. Notably, the GSEA generally provided the same interpretation as the GO analysis (data not shown).
We then created a functional interaction network with the ReactomeFI Cytoscape plugin (see the Methods) using these 615 NC-DEGs assigned into the six different categories (Fig. 4B), confirming that these genes, regardless of the categories to which they are assigned, are associated with COPD. Most of the input DEGs were inter-connected in a large network and the functional terms of each clustered subgroup in the network were consistent with the functions mentioned above, suggesting that these DEGs may participate in the pathogenic pathway leading to COPD. The integration of an additional source of functional information about these DEGs, i.e., previously known COPD-associated genes, including
We next attempted to further characterize the 237 common NC-DEGs by integrating them with other multi-omics data. Previously, Kim and Lee (2017) reported a total of 45 miRNAs to be differentially expressed in COPD patients (i.e., DE-miRNAs) compared with normal controls. It is hypothesized that the alteration of miRNA expression could lead to alterations in target gene expression, considering that miRNAs are well-established regulators of gene expression. Consequently, we searched the 237 common NC-DEGs for genes predicted to be targeted by these 45 DE-miRNAs by examining the intersection of these DEGs with previously known miRNA-target mRNA pairs (see the Methods). As a result, 43 of the 45 miRNAs were associated with a total of 4,786 mRNA targets. By overlapping the 237 DEGs with the 4,786 miRNA targets, 66 NC-DEGs were identified (Tables 1 and
Another mechanism of gene expression alteration is genetic mutations that occur in
We then examined whether a general pattern occurred in the mechanism of regulatory alterations, mediated either by miRNAs or by eQTLs, in functional categories into which the common NC-DEGs were assigned. Two groups of DEGs, i.e., DEGs targeted by the 45 DE-miRNAs and DEGs mapped to eQTLs, were overlapped, resulting in three categories of DEGs: genes whose expression was altered only by DE-miRNAs (designated ‘miRNA-specific’), genes whose expression was altered only by regulatory mutations (designated ‘eQTL-specific’), and whose expression was altered by both mechanisms (designated ‘both’). As shown in Table 1, only 20 NC-DEGs, including
Here, we identified genes that were significantly altered by COPD using samples previously published by Kim et al. (2015a). We not only confirmed the previous finding, showing the perturbation in the gene expression of inflammatory genes and mitochondrial genes, but also revealed a novel aspect regarding the changes in putative regulatory regions that might affect changes in COPD-DEG expression.
A problem confronted by Kim et al. in their study (Kim et al., 2015) and by us in the present study was that all COPD samples were accompanied by lung cancers, either AC or SC. However, a substantial difference between the study by Kim et al. (2015) and our present study is the strategy used to identify DEGs; we attempted to remove the bias driven by lung cancers in patients with COPD by detecting COPD-DEGs. Another difference was the recognition of an ‘intermediate’ group named group ‘I’, i.e., patients who cannot be classified as COPD or non-COPD by gene expression patterns alone. We observed few differences in gene expression levels between the ‘N’ and ‘I’ groups and between the ‘I’ and ‘C’ groups, and the FEV1/FVC ratio of group ‘I’ was scattered from low to high, indicating that COPD and non-COPD samples were likely intermingled within group ‘I’. In other words, the clinical diagnosis of patients in group ‘I’ as either COPD or normal might have been inaccurate, potentially leading to the identification of less reliable COPD-DEGs if group ‘I’ samples were included in the datasets. We postulate that the exclusion of samples from group ‘I’ and the collection of NC-DEGs helped us unambiguously identify reliable COPD-DEGs. In fact, the subsequent analyses performed after DEG identification, including the GO analysis and functional interaction network analysis, confirmed the previous findings from COPD-driven transcriptome analyses, as the expression of inflammatory genes and apoptosis genes was upregulated, and the expression of oxidation-reduction genes was downregulated. A third difference exists with regard to our effort to integrate DEGs with various omics datasets to investigate a mechanistic question underlying the expression alteration occurring in COPD. The integration of multi-omics data, including GDA, eQTLs, and miRNAs, provided us an opportunity to link changes in upstream regulatory regions and changes in downstream gene expression, through which we identified some common pathways involved in the development of COPD, regardless of co-occurring cancer types. Additionally, different functional classes of genes were altered by different lung cancer backgrounds or by the upregulation or downregulation of genes.
Finally, we further validated alterations in the expression of some selected genes in the two mouse COPD model systems, in which
We think that our present study will contribute to understanding the molecular etiology of COPD coupled with lung cancers and to identifying diagnostic markers of COPD.
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2016R1D1A1B03930411).
Results of the GO analysis
Category (n) | Gene Ontology | |
---|---|---|
miRNA specific (46) | positive regulation of smooth muscle cell proliferation | 4.26E-07 |
positive regulation of transcription from RNA polymerase II promoter | 2.61E-05 | |
positive regulation of cell proliferation | 1.64E-04 | |
cellular response to tumor necrosis factor | 2.79E-03 | |
cell adhesion | 6.11E-03 | |
Both (20) | positive regulation of ERK1 and ERK2 cascade | 9.54E-04 |
negative regulation of apoptotic process | 1.49E-03 | |
positive regulation of fibroblast proliferation | 1.68E-03 | |
positive regulation of MAP kinase activity | 2.00E-03 | |
leukocyte migration | 8.26E-03 | |
eQTL specific (76) | respiratory electron transport chain | 2.00E-03 |
lymphocyte homeostasis | 1.66E-02 | |
positive regulation of intrinsic apoptotic signaling pathway in response to DNA damage | 1.98E-02 | |
inflammatory response | 3.74E-02 | |
protein phosphorylation | 6.53E-02 |
Mol. Cells 2019; 42(4): 333-344
Published online April 30, 2019 https://doi.org/10.14348/molcells.2019.2442
Copyright © The Korean Society for Molecular and Cellular Biology.
Dong-Yeop Kim1, Woo Jin Kim2, Jung-Hyun Kim2, Seok-Ho Hong2, and Sun Shim Choi1,*
1Division of Biomedical Convergence, College of Biomedical Science, Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon 24341, Korea, 2Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon 24341, Korea
Correspondence to:*schoi@kangwon.ac.kr
Various genetic and environmental factors are known to be associated with chronic obstructive pulmonary disease (COPD). We identified COPD-related differentially expressed genes (DEGs) using 189 samples accompanying either adenocarcinoma (AC) or squamous cell carcinoma (SC), comprising 91 normal and 98 COPD samples. DEGs were obtained from the intersection of two DEG sets separately identified for AC and SC to exclude the influence of different cancer backgrounds co-occurring with COPD. We also measured patient samples named group ‘I’, which were unable to be determined as normal or COPD based on alterations in gene expression. The Gene Ontology (GO) analysis revealed significant alterations in the expression of genes categorized with the ‘cell adhesion’, ‘inflammatory response’, and ‘mitochondrial functions’, i.e., well-known functions related to COPD, in samples from patients with COPD. Multi-omics data were subsequently integrated to decipher the upstream regulatory changes linked to the gene expression alterations in COPD. COPD-associated expression quantitative trait loci (eQTLs) were located at the upstream regulatory regions of 96 DEGs. Additionally, 45 previously identified COPD-related miRNAs were predicted to target 66 of the DEGs. The eQTLs and miRNAs might affect the expression of ‘respiratory electron transport chain’ genes and ‘cell proliferation’ genes, respectively, while both eQTLs and miRNAs might affect the expression of ‘apoptosis’ genes. We think that our present study will contribute to our understanding of the molecular etiology of COPD accompanying lung cancer.
Keywords: COPD, differentially expressed genes, lung cancer, regulatory alterations,
Chronic obstructive pulmonary disease (COPD), a complex age-related disease, has two components: chronic bronchitis, characterized by productive cough, and emphysema, demonstrated by destruction of the lung parenchyma (Agustí et al., 2012; Mannino and Buist, 2007; Rabe et al., 2007; Vestbo et al., 2013). COPD patients also suffer from shortness of breath due to chronic airway obstruction and inflammation (Fabbri et al., 2003; Vestbo et al., 2013). COPD is generally diagnosed by chronic and irreparable impairment of lung airflow (Aaron et al., 2007; Barnett, 2005; Calverley et al., 2007; Vestbo et al., 2013). The most prominent cause of COPD is cigarette smoking (CS), although this factor is not the only cause of the disease, and not all smokers have the disease (Buist et al., 2008; Kim et al., 2017; Lundback et al., 2003; Mannino et al., 2003; Stone et al., 1983; Swanney et al., 2008).
Several recent studies have shown that genetic factors contribute to COPD (Anderson and Bozinovski, 2003; Foreman et al., 2012; Jeong et al., 2018; Sakao et al., 2003; Sandford et al., 1998). For instance, loss-of-function mutation of α1-antitrypsin is a well-known genetic risk factor, although it is responsible for only 1–5% of COPD patients (Smith and Harrison, 1997; Stoller and Aboussouan, 2005). In addition, with the recent success of genome-wide association studies (GWAS), the list of COPD-associated genes is expanding rapidly, including FAM13A, HHIP, IREB2, RAB4B, EGLN2, MIA, CYP2A6 (Hardin and Silverman, 2014; Hobbs et al., 2017; Kim and Lee, 2015). Note that many COPD-associated single nucleotide polymorphisms (SNPs) are located in noncoding regions such as intergenic and intronic regions (Artigas et al., 2011; Repapi et al., 2010) rather than in protein-coding regions, which is also the case for other disease-associated SNPs. These noncoding but disease-associated SNPs may contribute to the altered regulation of gene expression, splicing, and epigenetic modifications.
Interestingly, CS can perturb gene expression by affecting various epigenetic markers including DNA methylation and chromatin modification (Belinsky et al., 2002; Kim et al., 2001; Lee and Pausova, 2013). A variety of epigenetic machineries for regulating downstream gene expression are known to be altered in COPD (Lawless et al., 2009; Schamberger et al., 2014). All these studies consistently indicate that both genetic and epigenetic alterations are important in the etiology of COPD. However, the mechanism by which these genes and their genetic mutations mediate the pathogenesis of the disease remains to be elucidated.
Meanwhile, an understanding of the perturbations in gene expression in various diseases will potentially contribute to the identification of molecular targets to develop new therapeutic drugs or prognostic modalities. Transcriptome studies using either microarrays or RNA-Seq approaches have been employed for those purposes in COPD as well (Chen et al., 2008; Kim et al., 2015a; Rangasamy et al., 2009; Steiling et al., 2013; Wang et al., 2008). For instance, according to Wang et al. (2008), extracellular matrix (ECM) and apoptosis genes are upregulated, whereas genes involved in anti-inflammatory functions are down-regulated in a microarray of 48 human lung samples, including normal tissues and samples from patients with various stages of COPD ranging from GOLD (Global Initiative for Chronic Obstructive Lung Diseases) stage 0 to GOLD stage 3. In addition, several other genes involved in inflammatory responses, including cytokines and chemokines, and genes involved in oxidative stress responses are associated with COPD progression (Chen et al., 2008; Kim et al., 2015a; Rangasamy et al., 2009; Steiling et al., 2013).
However, alterations in gene expression are notably heterogeneous among studies with different designs using different cell types (Novak et al., 2002), which is not surprising because 12 different cell types constitute the whole lung tissue (Wang et al., 2008). Moreover, most subjects with COPD were also diagnosed with lung cancers, which was the reason for undergoing lung resection, and thus the DEG results cannot be as easily interpreted as the genes that are altered by COPD alone (Wang et al., 2008). As shown in the study by Spira et al. (2007), gene expressions in normal airway epithelial cells derived from patients with lung cancers differ in a cancer-specific manner.
In the present study, we thus attempt to avoid complexities in estimating DEGs associated with COPD (i.e., COPD-DEGs) driven by different cancer backgrounds co-occurring with COPD or by likely misclassified patient samples named group ‘I’, basically by revisiting the previous study published by Kim et al. (2015a). In addition, we integrate the COPD-DEGs with various multi-omics data, revealing novel insights relevant to COPD, which may ultimately contribute to improving our understanding of the molecular etiology of COPD or to identifying molecular targets for the development of a novel diagnostic or prognostic strategy.
For the present study, we obtained the RNA-Seq data produced from COPD patients with lung cancers from Kim et al. (2015a) (
FPKM values downloaded from the GEO database (
The 189 patient samples in the GSE57148, comprising 91 samples labeled ‘normal’ and 98 samples labeled ‘COPD’, were further divided into COPD coupled with adenocarcinoma (AC-COPD; 58 normal versus 36 COPD) and COPD coupled with squamous cell carcinoma (SC-COPD; 33 normal versus 62 COPD). Detailed clinical information on the sex, age, FEV1/FVC ratio, and cancer types of the samples was retrieved from Kim et al. (2015a) and is provided as
A total of 45 differentially expressed miRNAs (DE-miRNAs) detected in COPD were obtained from Kim and Lee (2017). Information about miRNAs and their target mRNAs was obtained from miRTarBase (ver. 7.0,
A total of 910 previously known COPD-associated genes were obtained from the gene-disease association (GDA) database of DisGeNET (ver. 5.0,
All statistical tests and their related diagrams were implemented with R (ver. 3.5.0,
Total RNA was extracted from two types of mouse COPD models, i.e., five mice from the elastase-induced model (Suki et al., 2017) and three mice from the smoking-induced model (Cavarra et al., 2001); please refer to previous studies to find detailed protocols for constructing mouse COPD model systems (Cavarra et al., 2001; Huh et al., 2011; Kim et al., 2015b; Suki et al., 2017). Total RNA was isolated using the RNeasy Mini Kit (Qiagen, Germany) according to the manufacturer’s instructions, and the concentration was quantified with an Epoch Microplate Spectrophotometer (Biotek Instruments, Inc.). The quality of total RNAs was then tested by measuring the ratio of absorbance at 260 nm and 280 nm. Reverse transcription was conducted using the TOPscriptδ RT DryMIX kit (Enzynomics); for PCR, 1 μl of synthesized cDNA was used with AccuPower PCR pre-MIX (Bioneer). Quantitative real-time PCR (qPCR) was performed to quantify the mRNA expression of candidate genes using the ABI Step One Plus System Instrument (Applied Biosystems). The qPCRs were performed according to the manufacturer’s instructions with TOPrealδ qPCR 2X PreMIX SYBR Green with high ROX (Enzynomics) in a Microamp fast 96-well reaction plate. Levels of relative gene expression were normalized to GAPDH expression. The primer sequences are provided in
We collected RNA-Seq data from a total of 189 samples from the Gene Expression Omnibus (GEO) database (see Methods) to revisit the previous findings (Kim et al., 2015a) regarding COPD-associated gene expression signatures. As described in the original paper, the samples were all derived from either adenocarcinoma (AC) or squamous cell carcinoma (SC) of the lungs, regardless of whether they were labeled normal or COPD. Therefore, the terms ‘normal’ and ‘COPD’ here refer to lung cancer patients without COPD and lung cancer patients with COPD, respectively. We thus decided to classify COPD patients into two groups by the lung cancer types from which they suffered simultaneously, i.e., COPD patients with adenocarcinoma (AC-COPD) and COPD patients with squamous cell carcinoma (SC-COPD)(Fig. 1). Accordingly, the samples from AC and SC patients without COPD were named ‘AC-normal’ and ‘SC-normal’, respectively.
Consistent with the study by Kim et al. (2015), patients with COPD tended to be older, to have more pack-years of CS, and to have a significantly lower FEV1/FVC ratio than individuals without COPD (
We separately identified DEGs for the two COPD groups classified by the two cancer types, AC and SC, using cutoff values of Q < 0.01 and |fold change (FC)| ≥ 1.5 (see Methods). As a result, 150 DEGs were identified between 58 AC-normal and 36 AC-COPD samples (named AC-DEGs), and 58 DEGs were identified between 33 SC-normal and 62 SC-COPD samples (named SC-DEGs).
We conducted heatmap analysis accompanied by unsupervised hierarchical clustering for AC and SC to investigate whether these DEGs demarcated the normal and COPD samples (Figs. 2A and 2D). The classification of samples by the DEGs was not perfect in either cancer types (i.e., some normal samples are clustered in an intermingled way with some COPD samples, and vice versa), leading to the recognition of a third group of patient samples, separate from normal and COPD.
We thus performed
Interestingly, the median FEV1/FVC ratio of group ‘I’ was approximately equivalent to group ‘N’, although the ratio ranged widely from less than 50% (clinical COPD) to greater than 80% (clinically normal) for both AC and SC (Figs. 2C and 2F). Based on this result, group ‘I’ may contain misclassified patient samples, although they had been clinically diagnosed with or without COPD.
After establishing a third group of patients, group ‘I’, we decided to re-identify the three sets of DEGs to determine how the gene expression patterns of the patients in group ‘I’ e differed. We defined NC-DEGs, NI-DEGs, IC-DEGs by comparing gene expression between groups ‘N’ and ‘C’, ‘N’ and ‘I’, and ‘I’ and ‘C’, respectively. Several statistical cutoff values for defining the three sets of DEGs were tested, from which we found that very few NI- and IC-DEGs remained when Q < 0.01 and | FC| ≥ 2 were applied. However, 237 NC-DEGs still remained as DEGs at those stringent cutoffs (
Next, we tried to obtain common COPD-DEGs between AC-COPD and SC-COPD groups by collecting common NC-DEGs identified based on the intersection between the NC-DEG sets of AC and SC. As shown in Fig. 3, 237 common NC-DEGs (the bottom panel), including 131 upregulated and 106 downregulated genes, differentiated COPD from normal samples without ambiguity for both AC (the top left panel) and SC (the top right panel) using a heatmap accompanied by unsupervised hierarchical clustering. Notably, samples from group ‘I’ were excluded from this analysis of both AC and SC.
The intersection between NC-DEGs for AC and SC produced two additional groups of gene sets, ‘AC-specific’ (316 genes) and ‘SC-specific’ (62 genes), as well as the common NC-DEGs (Fig. 4). Moreover, according to the direction of the changes in expression (up- or downregulation), these three categories of NC-DEGs were divided into six categories: ‘Common-up’ (131 genes), ‘Common-down’ (106 genes), ‘AC-specific-up’ (140 genes), ‘AC-specific-down’ (176 genes), ‘SC-specific-up’ (29 genes), and ‘SC-specific-down’ (33 genes)(Fig. 4A).
We performed a GO analysis of each of the 6 categories of DEGs, confirming several previously well-known COPD-related genes in the ‘common’ category. For instance, GO functions such as ‘inflammatory response’, ‘cell adhesion’, and ‘ECM’ were significantly enriched in the ‘Common-up’ category, whereas mitochondria-related functions were significantly enriched in the ‘Common-down’ category (Fig. 4A). Interestingly, all these functional GO terms were completely consistent with the findings from recent reviews aiming to decipher the molecular pathology of COPD (Chen et al., 2008; Wang et al., 2008). Similar GO functional terms appeared in both AC-specific and SC-specific categories, although the numbers of genes associated with specific functional terms differed, suggesting that genes involved in molecular pathways associated with COPD etiology were commonly altered as well. Notably, however, inflammatory and apoptosis genes were prominent functional terms for the AC-specific category, but not the SC-specific category, whereas cell proliferation genes showed the opposite trend (Fig. 4A). We postulate that this difference in functional GO terms exists because the etiology of COPD in patients with AC and SC is not identical due to different genetic and epigenetic conditions associated with each cancer type. Notably, the GSEA generally provided the same interpretation as the GO analysis (data not shown).
We then created a functional interaction network with the ReactomeFI Cytoscape plugin (see the Methods) using these 615 NC-DEGs assigned into the six different categories (Fig. 4B), confirming that these genes, regardless of the categories to which they are assigned, are associated with COPD. Most of the input DEGs were inter-connected in a large network and the functional terms of each clustered subgroup in the network were consistent with the functions mentioned above, suggesting that these DEGs may participate in the pathogenic pathway leading to COPD. The integration of an additional source of functional information about these DEGs, i.e., previously known COPD-associated genes, including
We next attempted to further characterize the 237 common NC-DEGs by integrating them with other multi-omics data. Previously, Kim and Lee (2017) reported a total of 45 miRNAs to be differentially expressed in COPD patients (i.e., DE-miRNAs) compared with normal controls. It is hypothesized that the alteration of miRNA expression could lead to alterations in target gene expression, considering that miRNAs are well-established regulators of gene expression. Consequently, we searched the 237 common NC-DEGs for genes predicted to be targeted by these 45 DE-miRNAs by examining the intersection of these DEGs with previously known miRNA-target mRNA pairs (see the Methods). As a result, 43 of the 45 miRNAs were associated with a total of 4,786 mRNA targets. By overlapping the 237 DEGs with the 4,786 miRNA targets, 66 NC-DEGs were identified (Tables 1 and
Another mechanism of gene expression alteration is genetic mutations that occur in
We then examined whether a general pattern occurred in the mechanism of regulatory alterations, mediated either by miRNAs or by eQTLs, in functional categories into which the common NC-DEGs were assigned. Two groups of DEGs, i.e., DEGs targeted by the 45 DE-miRNAs and DEGs mapped to eQTLs, were overlapped, resulting in three categories of DEGs: genes whose expression was altered only by DE-miRNAs (designated ‘miRNA-specific’), genes whose expression was altered only by regulatory mutations (designated ‘eQTL-specific’), and whose expression was altered by both mechanisms (designated ‘both’). As shown in Table 1, only 20 NC-DEGs, including
Here, we identified genes that were significantly altered by COPD using samples previously published by Kim et al. (2015a). We not only confirmed the previous finding, showing the perturbation in the gene expression of inflammatory genes and mitochondrial genes, but also revealed a novel aspect regarding the changes in putative regulatory regions that might affect changes in COPD-DEG expression.
A problem confronted by Kim et al. in their study (Kim et al., 2015) and by us in the present study was that all COPD samples were accompanied by lung cancers, either AC or SC. However, a substantial difference between the study by Kim et al. (2015) and our present study is the strategy used to identify DEGs; we attempted to remove the bias driven by lung cancers in patients with COPD by detecting COPD-DEGs. Another difference was the recognition of an ‘intermediate’ group named group ‘I’, i.e., patients who cannot be classified as COPD or non-COPD by gene expression patterns alone. We observed few differences in gene expression levels between the ‘N’ and ‘I’ groups and between the ‘I’ and ‘C’ groups, and the FEV1/FVC ratio of group ‘I’ was scattered from low to high, indicating that COPD and non-COPD samples were likely intermingled within group ‘I’. In other words, the clinical diagnosis of patients in group ‘I’ as either COPD or normal might have been inaccurate, potentially leading to the identification of less reliable COPD-DEGs if group ‘I’ samples were included in the datasets. We postulate that the exclusion of samples from group ‘I’ and the collection of NC-DEGs helped us unambiguously identify reliable COPD-DEGs. In fact, the subsequent analyses performed after DEG identification, including the GO analysis and functional interaction network analysis, confirmed the previous findings from COPD-driven transcriptome analyses, as the expression of inflammatory genes and apoptosis genes was upregulated, and the expression of oxidation-reduction genes was downregulated. A third difference exists with regard to our effort to integrate DEGs with various omics datasets to investigate a mechanistic question underlying the expression alteration occurring in COPD. The integration of multi-omics data, including GDA, eQTLs, and miRNAs, provided us an opportunity to link changes in upstream regulatory regions and changes in downstream gene expression, through which we identified some common pathways involved in the development of COPD, regardless of co-occurring cancer types. Additionally, different functional classes of genes were altered by different lung cancer backgrounds or by the upregulation or downregulation of genes.
Finally, we further validated alterations in the expression of some selected genes in the two mouse COPD model systems, in which
We think that our present study will contribute to understanding the molecular etiology of COPD coupled with lung cancers and to identifying diagnostic markers of COPD.
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2016R1D1A1B03930411).
. Results of the GO analysis.
Category (n) | Gene Ontology | |
---|---|---|
miRNA specific (46) | positive regulation of smooth muscle cell proliferation | 4.26E-07 |
positive regulation of transcription from RNA polymerase II promoter | 2.61E-05 | |
positive regulation of cell proliferation | 1.64E-04 | |
cellular response to tumor necrosis factor | 2.79E-03 | |
cell adhesion | 6.11E-03 | |
Both (20) | positive regulation of ERK1 and ERK2 cascade | 9.54E-04 |
negative regulation of apoptotic process | 1.49E-03 | |
positive regulation of fibroblast proliferation | 1.68E-03 | |
positive regulation of MAP kinase activity | 2.00E-03 | |
leukocyte migration | 8.26E-03 | |
eQTL specific (76) | respiratory electron transport chain | 2.00E-03 |
lymphocyte homeostasis | 1.66E-02 | |
positive regulation of intrinsic apoptotic signaling pathway in response to DNA damage | 1.98E-02 | |
inflammatory response | 3.74E-02 | |
protein phosphorylation | 6.53E-02 |
Minji Lee, Yeonhwa Song, Inhee Choi, Su-Yeon Lee, Sanghwa Kim, Se-Hyuk Kim, Jiho Kim, and Haeng Ran Seo
Mol. Cells 2021; 44(1): 50-62 https://doi.org/10.14348/molcells.2020.0212You-Soub Lee, Ja-Yeol Lee, Soo-Hyun Song, Da-Mi Kim, Jung-Won Lee, Xin-Zi Chi, Yoshiaki Ito, and Suk-Chul Bae
Mol. Cells 2020; 43(10): 889-897 https://doi.org/10.14348/molcells.2020.0182Dohun Kim, You-Soub Lee, Duk-Hwan Kim, and Suk-Chul Bae
Mol. Cells 2020; 43(1): 1-9 https://doi.org/10.14348/molcells.2020.2246