Mol. Cells 2021; 44(7): 433-443
Published online July 9, 2021
https://doi.org/10.14348/molcells.2021.0042
© The Korean Society for Molecular and Cellular Biology
Correspondence to: joonan30@korea.ac.kr
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
Multi-omics approaches are novel frameworks that integrate multiple omics datasets generated from the same patients to better understand the molecular and clinical features of cancers. A wide range of emerging omics and multi-view clustering algorithms now provide unprecedented opportunities to further classify cancers into subtypes, improve the survival prediction and therapeutic outcome of these subtypes, and understand key pathophysiological processes through different molecular layers. In this review, we provide an overview of the concept and rationale of multi-omics approaches in cancer research. We also introduce recent advances in the development of multi-omics algorithms and integration methods for multiple-layered datasets from cancer patients. Finally, we summarize the latest findings from large-scale multi-omics studies of various cancers and their implications for patient subtyping and drug development.
Keywords: cancer research, genomics, multi-omics approach, proteogenomics, proteomics, systems biology
Living organisms experience millions of signals transferred every second between cells, tissues, organs, and external environmental stimuli. Fine-tuned responses at various degrees and scales within the human body are central to the homeostatic mechanism that copes with potentially harmful environmental perturbations, including pathogens, smoking, and drugs, and interacts with the genetic background arising from spontaneous somatic mutations and numerous germline variants. Thus, a holistic view of homeostatic mechanisms through the study of genomic and epigenetic aberrations is needed to understand the core of cancer biology and the pathophysiological features of cancer during oncogenesis and tumor progression.
A multi-omics study is a data-driven scientific investigation that analyzes a range of high-dimensional datasets at multiple levels and scales to reveal the complexity of cells and their environment. This type of study can provide novel frameworks to untangle biological phenomena or models to test certain hypotheses using various datasets. In cancer research, a paradigm shift toward multi-omics approaches has been achieved with the recent development of high-throughput technologies in genomics and transcriptomics, increasing efforts in large-scale research collaboration, and the advancement of computational algorithms (Basu et al., 2013; Berns and Bernards, 2012; Cancer Genome Atlas Network, 2012b; Gentles and Gallahan, 2011; Whitehurst et al., 2007). Together with advances in genomics and transcriptomics, proteomics is emerging as a prominent field to elucidate the dynamics of gene activity. Large-scale proteomic research, such as that promoted by the Clinical Proteomic Tumor Analysis Consortium (CPTAC), has uncovered the ubiquitous link of biomolecules to the environment and disease status (Gillette et al., 2020; Krug et al., 2020; Mertins et al., 2016; Mun et al., 2019; Zhang et al., 2016). Such a transition has extensively deepened our knowledge of the function of driver genes and proteins and has provided a comprehensive understanding of the signaling networks occurring between cells, tissues, organs, and the entire organism. Multi-omics approaches have been applied to numerous clinical studies for better identification of clinical subtypes or drug resistance, prediction of effective combination therapies, and identification of predictive biomarkers to increase the response rate to targeted treatments.
In this review, we introduce the concept of multi-omics approaches in cancer research and provide useful resources for this. We focus on some of the clinical and basic science studies that have benefited from the use of a multi-omics approach to uncover novel concepts and properties. We also discuss some of the challenges connected to multi-omics approaches and how this relatively young field of study can have a positive impact on cancer research.
Over the past decades, there have been rapid advances in high-throughput technologies, which enable a range of genomic analyses at the cellular and tissue levels. Furthermore, highly developed sequencing technologies have enabled comprehensive genome screening through whole exome sequencing (WES) and whole genome sequencing (WGS), as well as the collection of gene expression data (e.g., RNA sequencing [RNA-seq] and microRNA [miRNA] profiling) and DNA methylation profiles (Cancer Genome Atlas Network, 2012a; 2012b; Cancer Genome Atlas Research Network, 2011, 2013; Cancer Genome Atlas Research Network et al., 2013a; Chin et al., 2006; Hennessy et al., 2010; Neve et al., 2006). Single-cell technologies provide new biological insights for the understanding of gene activity and cytological characteristics at the cellular level (Lee et al., 2021; Stuart et al., 2019; Stuart and Satija, 2019). In addition, large amounts of proteins and metabolites can be detected with high accuracy owing to the maturation of mass spectrometry techniques (Lai et al., 2018; Palmer et al., 2017; Schubert et al., 2017). Proteomics technologies allow the detection of almost all human proteins and are advancing toward single-cell resolution (Marx, 2019; Vidova and Spacil, 2017). However, a single platform is insufficient to decipher the complexity underlying cancer genomes or to find a robust association with cancer driver mutations (Bozic et al., 2010; Greenman et al., 2007). Consequently, there is an emerging effort in the development of data-driven mathematical and computational methods to analyze high-dimensional datasets obtained from several novel analysis platforms (Bodenmiller et al., 2012; Hill et al., 2012; Pritchard et al., 2013; Qiu et al., 2011; Sumazin et al., 2011; Tentner et al., 2012; Teves and Won, 2020).
In this regard, multi-omics approaches have been introduced to integrate multiple omics datasets generated from patients and identify coherent and preserved molecular or clinical features across different datasets (Fig. 1). Multi-omics studies aim to identify patient subgroups and biological features underlying cancer pathophysiology; they have been applied to overcome current complexities, due to genetic and phenotypic heterogeneity, that hinder our understanding of cancer genesis and progression, and to design effective predictive models to validate novel therapies and drugs. Within such an integrative framework, there has been an emerging effort to develop computational and mathematical methods that can decipher the complexity of cancer heterogeneity, since genomic and epigenetic instability in tumors can alter intracellular responses to the local environment and affect the individual as a whole through the tumorigenic process.
Over the last decade, a range of modeling approaches have been developed to deal with various aspects of cancer. In particular, the integration of large omics datasets has enabled modeling of cellular behaviors at the tissue level to understand cancer pathophysiology or the behavior of cancer cells in response to drugs and angiogenesis (Carro et al., 2010; Hong et al., 2020; Huang et al., 2013; Iadevaia et al., 2010; Pascal et al., 2013; Swanson et al., 2011). Multi-omics studies have opened new avenues for the implementation of targeted therapies for cancer treatment. Integrative approaches with large-scale multi-omics datasets have the potential to delineate the relationship between molecular markers and the response to targeted therapies. A more comprehensive understanding of the molecular characteristics of non-responsive or resistant tumors could enable more precise predictions of therapy outcomes, resulting in increased therapeutic efficacy or in the ability to bypass drug resistance. In addition, multi-omics approaches might allow the identification of subgroups of patients that are most likely to benefit from therapy.
Cancer cells exhibit extreme levels of genetic heterogeneity and genomic instability. Thus, many putative driver aberrations can be observed: some could be
Recent advances in high-throughput sequencing technologies have allowed the measurement of a large number of molecular patterns of cancer in a single experiment. High-throughput measurements enable rapid and unbiased profiling of somatic mutations, copy number variations (CNVs), and mRNA, non-coding RNA, and protein expression. Various computational algorithms have been proposed for multi-view clustering, to detect coherent features from heterogeneous inputs. In the biomedical domain, this has facilitated the definition of the clinical subtypes of complex disorders, such as cancers. Clustering methods have been widely developed to identify co-expressed gene modules and subgroups of patients within a certain disease (Langfelder and Horvath, 2008). The integration of multi-omics datasets for the same set of samples has been devised to better understand fine-tuned structures, which are not revealed by examining only a single data type. For instance, cancer subtypes can be classified based on multi-omics datasets, such as gene expression and mutation profiles, from the same patients (Chauvel et al., 2020). Multi-omics clustering can ameliorate potential bias or noise from a single omics dataset as the integration of multiple omics layers can fully represent different cellular aspects from the genomic to the epigenomic level (Nguyen and Wang, 2020; Wang et al., 2014).
To date, various tools have been developed for multi-omics datasets with the following objectives: 1) identify disease subtypes or classify subgroups, 2) identify putative biomarkers for diagnostics and driver genes for diseases, and 3) gain insights into disease biology. Multi-omics frameworks are mostly based on Bayesian statistics (Kirk et al., 2012; Lock and Dunson, 2013; Shen et al., 2009; Vaske et al., 2010; Wu et al., 2015; Yuan et al., 2011), similarity networks (Nguyen et al., 2019; Wang et al., 2014), joint nonnegative matrix factorization (Yang and Michailidis, 2016), and sparse canonical correlation analysis (Witten and Tibshirani, 2009). Several multi-omics tools are widely used in the field or outperform alternatives in subtype prediction and survival analysis (Table 1). However, most multi-omics tools rely on different mathematical theories and support different ranges of data types. Even when using the same data, their performance varies greatly depending on the biological characteristics of the study subjects. Therefore, acquiring biological insights from multi-omics data is a computational and biological challenge, requiring the researcher to select appropriate multi-omics tools.
iCluster is an early multi-omics integration method that first integrates multiple inputs and then identifies multi-omics clusters by joint estimation of latent variables and through clustering and expectation–maximization-like algorithms (Shen et al., 2009). It was initially used for large-scale cancer genomic projects, for example for breast and lung cancer, in which gene expression and CNVs were summarized for multiple subgroups of patients. Since the runtime of iCluster increases with the number of features, iCluster+, providing full Bayesian regularization for clustering, has recently been proposed (Mo et al., 2013). iCluster+ identified colorectal cancer subtypes with different cancer progression pathways, one of which was found not to require aggressive drug treatment in addition to surgery.
iOmicsPASS is a network-based algorithm that can merge genome-based networks with multi-omics datasets (Koh et al., 2019). Scores for biological interaction are computed by transformation of omics datasets and used as an input to construct networks, whose edges are defined for phenotypic groups using a modified nearest shrunken centroid algorithm. iOmicsPASS was shown to improve the identification of breast invasive ductal carcinoma (IDC) subtypes by integrating mRNA expression and protein abundance data. Such integrated analysis by iOmicsPASS revealed a new transcriptional regulatory network in a specific breast cancer subtype that could not be found through single-omics analysis.
SALMON is a deep learning method based on co-expression networks (Huang et al., 2019). It takes multi-omics datasets from cancer patients and computes eigengenes from co-expression modules, and can thus ameliorate the overfitting that arises whenever multi-omics approaches are applied to datasets that contain many features but few samples. For example, by analyzing mRNA and miRNA datasets from 583 female breast invasive carcinoma patients, SALMON provided a good prediction of survival.
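The eigengene idea can be illustrated with a short sketch. The code below summarizes one co-expression module (genes by patients) as its first principal component across patients, computed by power iteration. The function name `module_eigengene` and the toy module are hypothetical, and this is only an assumption about how an eigengene may be computed; the module detection and network used by SALMON itself are not reproduced here.

```python
import math

def module_eigengene(expr, n_iter=200):
    """Summarize a co-expression module as one value per patient.

    expr: list of genes, each a list of expression values across patients.
    Returns the unit-norm first right singular vector of the gene-centered
    matrix (its sign is arbitrary, as for any principal component).
    """
    n_genes = len(expr)
    n_samples = len(expr[0])
    # Center each gene across patients.
    centered = []
    for row in expr:
        mean = sum(row) / n_samples
        centered.append([value - mean for value in row])
    # Power iteration on C^T C, where C is the centered matrix.
    v = [float(j + 1) for j in range(n_samples)]  # arbitrary non-constant start
    for _ in range(n_iter):
        u = [sum(c[j] * v[j] for j in range(n_samples)) for c in centered]
        w = [sum(centered[i][j] * u[i] for i in range(n_genes))
             for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n_samples  # module has no variance
        v = [x / norm for x in w]
    return v

# Toy module: three genes that rise together across four patients.
toy_module = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.5, 2.5, 3.5, 4.5],
]
eigengene = module_eigengene(toy_module)
```

Replacing hundreds of correlated genes with one eigengene per module is what reduces the number of input features relative to the number of patients.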
SNF is a novel algorithm for the generation of patient similarity networks that uses an iterative procedure based on message passing (Wang et al., 2014). It calculates similarity networks for individual patients and then merges them to identify disease subtypes and predict phenotypes. In contrast to early integration, SNF takes advantage of individual omics datasets to construct independent single-omics networks and find coherent modules sourced from similar biological features across patients with similar clinical features. SNF iteratively applies a local K-nearest neighbors (KNN) approach to compute a patient similarity matrix for each omics dataset. When merging the global similarity matrices from all omics datasets, SNF conducts averaging of similarity matrices with iterative updating. It has demonstrated high efficiency in identifying clinical subtypes of cancers and other disorders such as autism (Cavalli et al., 2017; Ramaswami et al., 2020).
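The fusion procedure described above can be sketched in a few lines. The example below is a simplified illustration, not the published SNF implementation: it assumes a plain Gaussian kernel with a fixed `sigma`, simple row normalization, and at least two omics layers, and all function names are hypothetical. Each layer contributes a full similarity matrix and a local KNN kernel; in every iteration a layer's matrix is updated by diffusing the average of the other layers through its own KNN kernel, and the layers are averaged at the end.

```python
import math

def affinity(points, sigma=1.0):
    """Gaussian similarity between patients within one omics layer."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2 * sigma ** 2))
             for q in points] for p in points]

def row_normalize(W):
    return [[w / sum(row) for w in row] for row in W]

def knn_kernel(W, k):
    """Keep each patient's k most similar patients (self included)."""
    S = []
    for row in W:
        keep = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        S.append([row[j] if j in keep else 0.0 for j in range(len(row))])
    return row_normalize(S)

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def snf_sketch(views, k=2, n_iter=10):
    """views: list of omics layers, each a list of per-patient feature vectors."""
    W = [affinity(v) for v in views]
    P = [row_normalize(w) for w in W]   # global similarity per layer
    S = [knn_kernel(w, k) for w in W]   # local (KNN) similarity per layer
    n, m = len(P[0]), len(P)
    for _ in range(n_iter):
        updated = []
        for v in range(m):
            others = [P[u] for u in range(m) if u != v]
            avg = [[sum(o[i][j] for o in others) / len(others) for j in range(n)]
                   for i in range(n)]
            # Diffuse the other layers' information through this layer's KNN graph.
            updated.append(row_normalize(matmul(matmul(S[v], avg), transpose(S[v]))))
        P = updated
    fused = [[sum(p[i][j] for p in P) / m for j in range(n)] for i in range(n)]
    # Symmetrize so the result can be used as a patient similarity network.
    return [[(fused[i][j] + fused[j][i]) / 2 for j in range(n)] for i in range(n)]

# Two toy layers for four patients; patients 0-1 and 2-3 form two groups.
layer_a = [[0.0], [0.1], [3.0], [3.1]]
layer_b = [[1.0, 0.0], [1.1, 0.1], [5.0, 4.0], [5.2, 4.1]]
fused_network = snf_sketch([layer_a, layer_b])
```

In the fused network, similarities supported by both layers are reinforced, so a clustering step such as spectral clustering can then be applied to the single fused matrix.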
NEMO is a multi-omics clustering method that can be used for partial datasets without the need for data imputation (Rappoport and Shamir, 2019). NEMO first calculates an inter-patient similarity matrix for each omics dataset and then combines the matrices of different omics datasets into a single matrix. Clusters are identified using an adjusted Rand index to compute the similarity between patients by distance. NEMO was shown to outperform other multi-omics clustering algorithms when tested on multi-omics datasets of 10 cancers, and exhibited enhanced cluster detection from partial datasets.
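How partial datasets can be combined without imputation is easiest to see in code. The sketch below is a simplified illustration under stated assumptions, not NEMO's published procedure: it uses a plain Gaussian similarity and averages, for each pair of patients, only over the omics layers in which both patients were measured. The function names and the toy data are hypothetical.

```python
import math

def similarity(x, y, sigma=1.0):
    """Gaussian similarity between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))

def combine_partial(omics, patients):
    """Average pairwise similarity over the layers shared by both patients.

    omics: {layer_name: {patient_id: feature_vector}}; a patient may be
           missing from any layer.
    Returns {(p, q): mean similarity, or None if no layer covers both}.
    """
    combined = {}
    for i, p in enumerate(patients):
        for q in patients[i + 1:]:
            sims = [similarity(layer[p], layer[q])
                    for layer in omics.values()
                    if p in layer and q in layer]
            combined[(p, q)] = sum(sims) / len(sims) if sims else None
    return combined

# Toy example: patient P2 has no methylation profile.
toy_omics = {
    "mrna": {"P1": [0.0], "P2": [0.0], "P3": [2.0]},
    "methylation": {"P1": [1.0], "P3": [1.0]},
}
pairwise = combine_partial(toy_omics, ["P1", "P2", "P3"])
```

Because each pair is scored only on the layers it shares, a patient with a missing layer still receives a similarity to every other patient, and no values need to be imputed.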
MONET is a method for detecting similar modules commonly present across multi-omics datasets (Rappoport et al., 2020). MONET utilizes three omics datasets (mRNA expression, DNA methylation, and miRNA expression) to compute an edge-weighted graph per omics dataset, where nodes represent samples and edges represent the similarity between samples. It then detects a disjoint set of modules for patients from multiple omics graphs. MONET was used to conduct benchmarking on 287 patients with ovarian serous cystadenocarcinoma, and revealed four sample modules representing venous invasion status and survival rates.
PARADIGM is a method to identify specific biological pathways from a multi-omics dataset (Vaske et al., 2010). It combines multi-omics-scale values derived from an individual sample with gene activities, products, and an overview of the pathway interactions included in the National Cancer Institute (NCI) database, which contains information on protein-protein interactions. PARADIGM utilizes factor graphs derived from variables representing the state of various entities (e.g., a specific mRNA molecule or protein complex), and then creates probabilistic graphical models. Using these, it infers significant and non-significant interactions between pathways involving different entities. This tool proved to be efficient, and revealed four subtypes of glioblastoma leading to significantly different survival outcomes according to the perturbed pathways. This result suggests that the cancer subtype could be used as a basis to support clinical decisions.
LRAcluster is a multi-omics approach that integrates data on somatic mutations, CNVs, DNA methylation, and gene expression, and performs low-rank approximation from the probabilistic models of various molecular features (Wu et al., 2015). All molecular features from the omics datasets are transformed into variables and arranged in a parameter matrix, which is subject to the low-rank assumption. Next, dimension reduction is conducted, revealing clusters associated with distinct clinical subtypes. LRAcluster outperformed other existing methods in terms of both time and classification accuracy when tested on multi-omics datasets of breast invasive carcinoma, colon adenocarcinoma, and lung adenocarcinoma (LUAD).
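The low-rank assumption can be written compactly. The formulation below is one common way to express such an assumption and is given as an illustration, not as the exact objective optimized by LRAcluster. The symbols are introduced here for this purpose only: $X$ is the observed multi-omics data, $\Theta$ the parameter matrix with $p$ features and $n$ samples, $L$ the likelihood of the probabilistic model, and $r$ the target rank.

```latex
\hat{\Theta} = \arg\max_{\Theta} \; \log L(\Theta; X)
\quad \text{subject to} \quad \operatorname{rank}(\Theta) \le r,
\qquad
\hat{\Theta} \approx U V^{\top}, \quad
U \in \mathbb{R}^{p \times r}, \; V \in \mathbb{R}^{n \times r}
```

Under this reading, each row of $V$ is an $r$-dimensional representation of one sample, and clustering is carried out in this reduced space.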
BCC is a data-driven approach that performs consensus clustering across multi-omics datasets (Lock and Dunson, 2013). BCC is based on the finite Dirichlet mixture model to explain not only overall consensus clustering, but also important features inherent to an individual omics dataset. Given that clusters constructed using a single data type are roughly connected, BCC seeks an integrative point for their adherence to an overall cluster. BCC was applied to 384 breast cancer patients from TCGA datasets, including gene expression, DNA methylation, and protein data, and effectively revealed three cancer subtypes associated with specific clinical features.
Cancer research has taken advantage of advances in omics technologies from genomics to transcriptomics and of the wide range of resources of multiple omics datasets originating from the same patients. Multi-omics approaches provide a unique opportunity to identify the molecular and clinical features of cancer patients. In genomics and transcriptomics, there is an unmet need to disentangle incompatibility in related biological processes, such as differences in post-translational modifications or variability in expression profiles due to the role of mRNA transcripts in cancer development (Greenbaum et al., 2003; Hegde et al., 2003; Tyers and Mann, 2003). Recent advances in proteomics through the maturation of several mass spectrometry techniques have enabled the introduction of proteogenomic approaches, which can integrate genomic data with proteomics and information on post-translational modifications (e.g., protein phosphorylation and acetylation). Large-scale proteogenomic research, including that promoted by the CPTAC (Gillette et al., 2020; Krug et al., 2020; Mertins et al., 2016; Mun et al., 2019; Zhang et al., 2016), has been conducted to unravel new biological mechanisms in cancers and provide fundamental information on multi-omics approaches for the development of integration strategies or computational algorithms.
Multi-omics clustering further refined the association between molecular profiles and clinical features among cancer patients (Fig. 2). The identification of coherent subtypes across multiple dataset layers could have major implications for predicting clinical relevance or therapeutic response regardless of the overall tumor mutational load. Moreover, the integration of proteomics datasets enables the identification of a direct connection between mutations and phenotypes, and therefore increases the resolution of clustering patterns across samples. Here, we summarize the latest findings obtained in cancer research using multi-omics approaches.
Despite extensive research on its mutation signature and gene expression landscape, LUAD shows a high level of intrinsic or acquired resistance after treatment. Therefore, recent multi-omics-based efforts have been made to integrate genomic, transcriptomic, and proteomic datasets and decipher the molecular features underlying durable treatment responses.
Recently, the CPTAC has conducted a large-scale multi-omics study of LUAD by integrating WES, WGS, RNA-seq, miRNA and DNA methylation profiling, and high-resolution mass spectrometry-based proteomics, phosphoproteomics, and acetylproteomics. Integrative multi-omics clustering revealed four clusters of clinical and molecular features. For example, the patients in Cluster 1 were mostly
In another large-scale study, Chen et al. (2020) applied multi-omics approaches to early-stage, non-smoker patients in Taiwan using WES, RNA-seq, and proteomics datasets. Clustering was performed separately for proteomics, transcriptomics, and phosphoproteomics datasets, and clustering of proteomics data into three subtypes was chosen as the best representative of tumor staging and driver mutation classification. The largest group, Subtype 1, was composed of late-stage tumors (> II) with a high mutation rate, including in
Multi-omics analyses have increased our knowledge of breast cancer biology. In particular, integrative analyses have revealed the recurrence of mutations in the
An integrative analysis of gene expression and proteomics has been applied to the survival data of
A recent study on 122 patients integrating data on mutations, mRNA expression, protein expression, and post-translational modifications (phosphorylation and acetylation) has yielded robust profiles to elucidate the biological features of breast cancer (Krug et al., 2020). The resulting subtypes, that is, the basal-inclusive, HER2-inclusive, LumA-inclusive, and LumB-inclusive subtypes, were similar to those generated by the already existing and widely used PAM50 assay but revealed hidden biological structures such as the status of the
Multi-omics research on gastric cancers revealed four subtypes: 1) an Epstein–Barr virus subtype with recurrent
In highly characterized samples of glioblastoma patients, a multi-omics approach has delineated core transcription factors (CEBP and STAT3) that widely regulate mesenchymal transformation in glioblastoma (Carro et al., 2010). Integrative analyses of gene expression and phosphoproteomes have identified several cellular features that respond to stress and growth factors (Hill et al., 2012; Huang et al., 2013), are key regulators of the EGFR signaling pathway, and are associated with patient survival outcomes (Amit et al., 2007). Similarly, combining proteomic and metabolomic profiles also revealed a unique regulatory function in a cellular network of stress and growth factors (Bordbar et al., 2012). Dekker et al. (2020) conducted an integrative multi-omics analysis of gene and protein expression, as well as phosphoproteomic profiles, using paired primary and recurrent tissue samples from eight glioblastoma patients. Half of the patients showed a marked difference in the phosphorylation of STMN1 (S38), a component of the ERBB4 signaling pathway.
Integrating methylation profiles with genomic and transcriptomic datasets can substantiate the utility of studying acute myeloid leukemia (AML). A multi-omics analysis of 200 adult patients with AML showed distinct gene expression and methylation patterns across samples (Cancer Genome Atlas Research Network et al., 2013b). In particular, CpG-sparse regions showed a marked difference in methylation due to gene mutations. AML cells with
A multi-omics approach has also been applied to pancreatic ductal adenocarcinoma (PDAC) by integrating omics profiling of 150 patients for mutations, gene expression (mRNA, miRNA, and long non-coding RNA [lncRNA]), DNA methylation, and protein expression (Cancer Genome Atlas Research Network, 2017).
Drug target discovery is a critical step in the development of cancer drugs and personalized therapeutics. In traditional drug target discovery, biomolecules with a confirmed mechanism of action are selected through a series of studies, which require enormous manpower (Lindsay, 2003; Paananen and Fortino, 2020). Over the last decade, putative drug targets have been identified through the latest high-throughput genomic approaches in combination with experimental validation, including overexpression or knockdown by RNAi and the use of transgenic animals and model organisms (Benson et al., 2006). Multi-omics is an interdisciplinary approach to study biological characteristics, and can comprehensively yield many drug target candidates in a cost-effective manner. The analysis of 14 cancer subtypes from TCGA multi-omics datasets revealed 40 driver genes associated with the Wnt, Notch, Hedgehog, JAK/STAT, NF-κB, and MAPK signaling pathways (Chen et al., 2014). Among them, well-known driver genes such as
Multi-omics approaches may allow systematic assessment of drug discovery for personalized cancer therapy and improve the efficacy of chemotherapy (Aguirre et al., 2018; Li et al., 2013; Pauli et al., 2017). Refining molecular-defined subsets of patients can provide information on drug response and resistance, which vary among patients. Cui et al. (2020) integrated the expression of lncRNA, miRNA, mRNA, methylation, and the profile of somatic mutations with the expression of drug response-related lncRNAs. These authors found that lncRNAs respond to diverse chemotherapeutic drugs and characterized some key lncRNAs, such as HOXA-AS2, which mediate resistance to the drug adriamycin in BRCA patients (Cui et al., 2020). Another proteogenomic study of breast cancer found that triple-negative BRCA (TNBC) tumors with
In this review, we introduced computational methods for multi-omics studies and reported the latest findings in cancer research based on them. Multi-omics approaches can characterize the intersection between different layers of quantitative information, systematically summarizing biological interactions from an individual cell or tissue to an individual patient with a primary tumor and possible metastases. In addition, such integration can reflect the molecular characteristics of tumors at various levels, from genes to proteins, and different cancer stages through multidisciplinary analysis.
Multi-omics approaches may hold the potential to study different cancer types with a high level of similarity, in terms of molecular characteristics, to basal-like breast cancer, high-grade serous ovarian cancer, and serous endometrial cancer (Cancer Genome Atlas Research Network et al., 2013a). A systems approach integrating multi-omics data is key to understanding cancer biology and investigating the molecular pathogenesis of cancer. Multi-omics data analysis across tumor types can identify molecular characteristics commonly underlying a range of cancer types and further detail patient subgroups as well as the molecular classification of cancer subtypes.
Therefore, multiple data layers, including genomics, transcriptomics, epigenomics, and proteomics datasets, are required to fully represent the molecular and clinical structures of cancer patients. The generation of high-quality and unbiased datasets is a critical part of multi-omics approaches. In addition, further studies should consider proper integration methods and computational algorithms for robust and systematic assessment to obtain solid findings and predictive models.
This work was supported by the Korean NRF Grant 2019M3E5D3073568 (to J.Y.A.) and a Korea University Grant.
Y.J.H. and J.Y.A. wrote the original draft. Y.J.H., C.H., G.H.L., J.M.P., and J.Y.A. reviewed and edited the manuscript. Y.J.H., C.H., and J.Y.A. provided a figure and table.
The authors have no potential conflicts of interest to disclose.
Table 1. List of computational frameworks for multi-omics cancer studies
Method | Findings | Datasets | Principles |
---|---|---|---|
iCluster (Curtis et al., 2012; Shen et al., 2009) | Novel subgroups from 2,000 breast tumors | mRNA expression, CNV | Joint latent variable model-based clustering method |
iOmicsPASS (Koh et al., 2019) | Novel transcriptional regulatory network from TCGA/CPTAC breast cancer data | mRNA expression, CNV, protein expression | Network construction using a modified nearest shrunken centroid algorithm |
SALMON (Huang et al., 2019) | Improved survival analysis | Mutation, mRNA/miRNA expression, CNV | Deep learning based on co-expression modules |
SNF (Wang et al., 2014) | Subtype classification of clinical relevance | mRNA, DNA methylation | Patient similarity networks using an iterative procedure based on message passing |
NEMO (Rappoport and Shamir, 2019) | Novel subtypes from even partial AML datasets | mRNA, DNA methylation | Sample clustering from partial datasets using an adjusted Rand index |
MONET (Rappoport et al., 2020) | Module detection of patient subtypes and improved survival analysis | mRNA, DNA methylation | Detect similar modules commonly present across multi-omics datasets |
PARADIGM (Vaske et al., 2010) | Detection of pathways affected by cancer with fewer false positives | mRNA expression, CNV | Pathway recognition algorithm applied to multi-omics datasets |
LRAcluster (Wu et al., 2015) | Subtype detection in both pan-cancer analysis and single cancer types | Mutation, mRNA expression, CNV, DNA methylation | Performance of low-rank approximation from probabilistic models |
BCC (Lock and Dunson, 2013) | Detection of patient subtypes in response to survival rates and driver mutation signatures | mRNA, DNA methylation, protein expression | Bayesian framework for estimation of an integrative clustering model |
a. Gene expression data with normalization (e.g., quantile normalization, fragments per kilobase of transcript per million mapped reads [FPKM]).
b. Quantification of miRNA expression.
c. Circular binary segmentation-based copy number segmented means.
d. Affymetrix 6.0 SNP arrays.
e. Protein quantification by iTRAQ (isobaric Tags for Relative and Absolute Quantification).
f. Reverse phase protein array (RPPA).
g. Illumina Human Methylation arrays.
h. In the SALMON method, the copy number burden (CNB) is calculated using the total gene length (Kb) from SNP 6 data, and the tumor mutation burden (TMB) is calculated using the total number of mutated genes reported in Mutation Annotation Format (MAF) files.
i. The LRAcluster method uses somatic mutation data converted into a binary form.
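The two mutation encodings mentioned in the footnotes, a binary mutation matrix and a mutation burden counted as the number of mutated genes, can be sketched as follows. This is a minimal illustration and not the preprocessing code of LRAcluster or SALMON. It assumes MAF records have already been parsed into dictionaries keyed by the standard MAF columns `Hugo_Symbol` and `Tumor_Sample_Barcode`; the function names and toy records are hypothetical.

```python
def binary_mutation_matrix(maf_records, samples, genes):
    """Return a samples-by-genes matrix: 1 if the gene is mutated, else 0."""
    mutated = {(r["Tumor_Sample_Barcode"], r["Hugo_Symbol"]) for r in maf_records}
    return [[1 if (s, g) in mutated else 0 for g in genes] for s in samples]

def mutated_gene_count(matrix):
    """Per-sample burden, counted as the number of mutated genes."""
    return [sum(row) for row in matrix]

# Toy MAF-like records; S1 carries two TP53 mutations, counted once.
toy_records = [
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "TP53"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "TP53"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "KRAS"},
    {"Tumor_Sample_Barcode": "S2", "Hugo_Symbol": "EGFR"},
]
toy_matrix = binary_mutation_matrix(
    toy_records, ["S1", "S2", "S3"], ["TP53", "KRAS", "EGFR"]
)
toy_burden = mutated_gene_count(toy_matrix)
```

Collapsing repeated mutations in the same gene into a single indicator keeps the mutation layer on a scale comparable to the other omics layers.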
Published online July 31, 2021 https://doi.org/10.14348/molcells.2021.0042
Yong Jin Heo1,2, Chanwoong Hwa1, Gang-Hee Lee1, Jae-Min Park1, and Joon-Yong An1,2,*
1School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul 02841, Korea, 2Department of Integrated Biomedical and Life Science, Korea University, Seoul 02841, Korea
Correspondence to:joonan30@korea.ac.kr
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
Multi-omics approaches are novel frameworks that integrate multiple omics datasets generated from the same patients to better understand the molecular and clinical features of cancers. A wide range of emerging omics and multi-view clustering algorithms now provide unprecedented opportunities to further classify cancers into subtypes, improve the survival prediction and therapeutic outcome of these subtypes, and understand key pathophysiological processes through different molecular layers. In this review, we overview the concept and rationale of multi-omics approaches in cancer research. We also introduce recent advances in the development of multi-omics algorithms and integration methods for multiple-layered datasets from cancer patients. Finally, we summarize the latest findings from large-scale multi-omics studies of various cancers and their implications for patient subtyping and drug development.
Keywords: cancer research, genomics, multi-omics approach, proteogenomics, proteomics, systems biology
Living organisms experience millions of signals transferred every second between cells, tissues, organs, and external environmental stimuli. Fine-tuned responses at various degrees and scales within the human body are central to the homeostatic mechanism that copes with potentially harmful environmental perturbations, including pathogens, smoking, and drugs, and interacts with the genetic background arising from spontaneous somatic mutations and numerous germline variants. Thus, a holistic view of homeostatic mechanisms through the study of genomic and epigenetic aberrations is needed to understand the core of cancer biology and the pathophysiological features of cancer during oncogenesis and tumor progression.
A multi-omics study is a data-driven scientific investigation that analyzes a range of high-dimensional datasets at multiple levels and scales to reveal the complexity of cells and their environment. This type of study can provide novel frameworks to untangle biological phenomena or models to test specific hypotheses using various datasets. In cancer research, a paradigm shift toward multi-omics approaches has been achieved with the recent development of high-throughput technologies in genomics and transcriptomics, increasing effort in large-scale research collaboration, and advancement of computational algorithms (Basu et al., 2013; Berns and Bernards, 2012; Cancer Genome Atlas Network, 2012b; Gentles and Gallahan, 2011; Whitehurst et al., 2007). Together with advances in genomics and transcriptomics, proteomics is emerging as a prominent field to elucidate the dynamics of gene activity. Large-scale proteomic research, such as that promoted by the Clinical Proteomic Tumor Analysis Consortium (CPTAC), has uncovered the ubiquitous link of biomolecules to the environment and disease status (Gillette et al., 2020; Krug et al., 2020; Mertins et al., 2016; Mun et al., 2019; Zhang et al., 2016). This transition has greatly deepened our knowledge of the function of driver genes and proteins and has provided a comprehensive understanding of the signaling networks operating between cells, tissues, organs, and the entire organism. Multi-omics approaches have been applied in numerous clinical studies for better identification of clinical subtypes or drug resistance, prediction of effective combination therapies, and identification of predictive biomarkers to increase the response rate to targeted treatments.
In this review, we introduce the concept of multi-omics approaches in cancer research and provide useful resources for this. We focus on some of the clinical and basic science studies that have benefited from the use of a multi-omics approach to uncover novel concepts and properties. We also discuss some of the challenges connected to multi-omics approaches and how this relatively young field of study can have a positive impact on cancer research.
Over the past decades, there have been rapid advances in high-throughput technologies, which enable a range of genomic analyses at the cellular and tissue levels. Furthermore, highly developed genome screening technologies, such as whole exome sequencing (WES) and whole genome sequencing (WGS), have enabled comprehensive collection of gene expression data (e.g., RNA sequencing [RNA-seq] and microRNA [miRNA] profiling) and DNA methylation profiles (Cancer Genome Atlas Network, 2012a; 2012b; Cancer Genome Atlas Research Network, 2011, 2013; Cancer Genome Atlas Research Network et al., 2013a; Chin et al., 2006; Hennessy et al., 2010; Neve et al., 2006). Single-cell technologies provide new biological insights for the understanding of gene activity and cytological characteristics at the cellular level (Lee et al., 2021; Stuart et al., 2019; Stuart and Satija, 2019). In addition, large amounts of proteins and metabolites can be detected with high accuracy owing to the maturation of mass spectrometry techniques (Lai et al., 2018; Palmer et al., 2017; Schubert et al., 2017). Proteomics technologies allow the detection of almost all human proteins and are advancing toward single-cell resolution (Marx, 2019; Vidova and Spacil, 2017). However, a single platform is insufficient to decipher the complexity underlying cancer genomes or to find a robust association with cancer driver mutations (Bozic et al., 2010; Greenman et al., 2007). Consequently, there is an emerging effort in the development of data-driven mathematical and computational methods to analyze high-dimensional datasets obtained from several novel analysis platforms (Bodenmiller et al., 2012; Hill et al., 2012; Pritchard et al., 2013; Qiu et al., 2011; Sumazin et al., 2011; Tentner et al., 2012; Teves and Won, 2020).
In this regard, multi-omics approaches have been introduced to integrate multiple omics datasets generated from patients and identify coherent and preserved molecular or clinical features across different datasets (Fig. 1). Multi-omics studies aim to identify patient subgroups and biological features underlying cancer pathophysiology; they have been applied to overcome current complexities, due to genetic and phenotypic heterogeneity, that hinder our understanding of cancer genesis and progression, and to design effective predictive models to validate novel therapies and drugs. Within such an integrative framework, there has been an emerging effort to develop computational and mathematical methods that can decipher the complexity of cancer heterogeneity, since genomic and epigenetic instability in tumors can alter intracellular responses to the local environment and affect the individual as a whole through the tumorigenic process.
Over the last decade, a range of modeling approaches have been developed to deal with various aspects of cancer. In particular, the integration of large omics datasets has enabled modeling of cellular behaviors at the tissue level to understand cancer pathophysiology or the behavior of cancer cells in response to drugs and angiogenesis (Carro et al., 2010; Hong et al., 2020; Huang et al., 2013; Iadevaia et al., 2010; Pascal et al., 2013; Swanson et al., 2011). Multi-omics studies have opened new avenues for the implementation of targeted therapies for cancer treatment. Integrative approaches with large-scale multi-omics datasets have the potential to delineate the relationship between molecular markers and the response to targeted therapies. A more comprehensive understanding of the molecular characteristics of non-responsive or resistant tumors could enable more precise predictions of therapy outcomes, resulting in increased therapeutic efficacy or in the ability to bypass drug resistance. In addition, multi-omics approaches might allow the identification of subgroups of patients who are most likely to benefit from therapy.
Cancer cells exhibit extreme levels of genetic heterogeneity and genomic instability. Thus, many putative driver aberrations can be observed: some could be
Recent advances in high-throughput sequencing technologies have allowed the measurement of a large number of molecular patterns of cancer in a single experiment. High-throughput measurements enable rapid and unbiased profiling of somatic mutations, copy number variations (CNVs), and mRNA, non-coding RNA, and protein expression. Various computational algorithms have been proposed for multi-view clustering, to detect coherent features from heterogeneous inputs. In the biomedical domain, this has facilitated the definition of the clinical subtypes of complex disorders, such as cancers. Clustering methods have been widely developed to identify co-expressed gene modules and subgroups of patients within a certain disease (Langfelder and Horvath, 2008). The integration of multi-omics datasets for the same set of samples has been devised to better understand fine-tuned structures, which are not revealed by examining only a single data type. For instance, cancer subtypes can be classified based on multi-omics datasets, such as gene expression and mutation profiles, from the same patients (Chauvel et al., 2020). Multi-omics clustering can ameliorate potential bias or noise from a single omics dataset as the integration of multiple omics layers can fully represent different cellular aspects from the genomic to the epigenomic level (Nguyen and Wang, 2020; Wang et al., 2014).
To date, various tools have been developed for multi-omics datasets with the following objectives: 1) identify disease subtypes or classify subgroups, 2) identify putative biomarkers for diagnostics and driver genes for diseases, and 3) gain insights into disease biology. Multi-omics frameworks are mostly based on Bayesian statistics (Kirk et al., 2012; Lock and Dunson, 2013; Shen et al., 2009; Vaske et al., 2010; Wu et al., 2015; Yuan et al., 2011), similarity networks (Nguyen et al., 2019; Wang et al., 2014), joint nonnegative matrix factorization (Yang and Michailidis, 2016), and sparse canonical correlation analysis (Witten and Tibshirani, 2009). Several multi-omics tools are widely used in the field or show superior performance in subtype prediction and survival analysis (Table 1). However, most multi-omics tools rely on different mathematical theories and support different ranges of data types. Even when using the same data, their performance varies greatly depending on the biological characteristics of the study objects. Therefore, acquiring biological insights from multi-omics data is both a computational and a biological challenge, and requires the researcher to select appropriate multi-omics tools.
iCluster is an early multi-omics integration method that first integrates multiple inputs and then identifies multi-omics clusters by joint estimation of latent variables and through clustering and expectation–maximization-like algorithms (Shen et al., 2009). It was initially used for large-scale cancer genomic projects, for example for breast and lung cancer, in which gene expression and CNVs were summarized for multiple subgroups of patients. Since the runtime of iCluster increases with the number of features, iCluster+, providing full Bayesian regularization for clustering, has recently been proposed (Mo et al., 2013). iCluster+ identified colorectal cancer subtypes with different cancer progression pathways, one of which was found not to require aggressive drug treatment in addition to surgery.
iOmicsPASS is a network-based algorithm that can merge genome-based networks with multi-omics datasets (Koh et al., 2019). Scores for biological interaction are computed by transformation of omics datasets and used as an input to construct networks, whose edges are defined for phenotypic groups using a modified nearest shrunken centroid algorithm. iOmicsPASS was shown to improve the identification of breast invasive ductal carcinoma (IDC) subtypes by integrating mRNA expression and protein abundance data. Such integrated analysis by iOmicsPASS revealed a new transcriptional regulatory network in a specific breast cancer subtype that could not be found through single-omics analysis.
SALMON is a deep learning method based on co-expression networks (Huang et al., 2019). It takes multi-omics datasets from cancer patients and computes eigengenes from co-expression modules, and can thus ameliorate the overfitting that arises when multi-omics approaches are applied to datasets with many features but few samples. For example, by analyzing mRNA and miRNA datasets from 583 female breast invasive carcinoma patients, SALMON provided a good prediction of survival.
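To illustrate how an eigengene condenses a co-expression module into one value per sample, the following is a minimal sketch rather than SALMON's actual implementation. It assumes the eigengene is taken as the first principal component of the standardized module expression matrix, computed here by power iteration. The function names `standardize` and `module_eigengene` and the toy data are hypothetical.

```python
import math

def standardize(row):
    """Center one gene's expression vector and scale it to unit variance."""
    n = len(row)
    mean = sum(row) / n
    var = sum((x - mean) ** 2 for x in row) / (n - 1)
    sd = math.sqrt(var) or 1.0  # avoid dividing by zero for a constant gene
    return [(x - mean) / sd for x in row]

def module_eigengene(module, n_iter=200):
    """Return one value per sample summarizing a co-expression module.

    `module` is a list of genes, each a list of expression values across
    the same samples. The first gene is assumed to be non-constant, since
    its standardized profile is used as the starting vector.
    """
    z = [standardize(gene) for gene in module]
    n_samples = len(z[0])
    v = z[0][:]  # start inside the row space of the centered data
    for _ in range(n_iter):
        # Project every gene onto v, then map the scores back to sample space.
        scores = [sum(g[j] * v[j] for j in range(n_samples)) for g in z]
        w = [sum(scores[i] * z[i][j] for i in range(len(z)))
             for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Fix the sign so the eigengene follows the average module profile.
    mean_profile = [sum(g[j] for g in z) / len(z) for j in range(n_samples)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]
    return v

# Toy module: two co-expressed genes and one anti-correlated gene, five samples.
toy_module = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [5, 4, 3, 2, 1]]
eigengene = module_eigengene(toy_module)
```

Replacing the many genes of each module with a single eigengene reduces the number of input features from thousands to tens, which is the property that helps against overfitting when samples are scarce.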
SNF is an algorithm for the generation of patient similarity networks that uses an iterative procedure based on message passing (Wang et al., 2014). It computes a patient similarity network for each omics dataset and then fuses these networks to identify disease subtypes and predict phenotypes. In contrast to early integration, SNF takes advantage of individual omics datasets to construct independent single-omics networks and find coherent modules sourced from similar biological features across patients with similar clinical features. SNF applies a local K-nearest neighbors (KNN) approach to compute a patient similarity matrix for each omics dataset. It then merges the global similarity matrices from all omics datasets by averaging them with iterative updating. It has demonstrated high efficiency in identifying clinical subtypes of cancers and other disorders such as autism (Cavalli et al., 2017; Ramaswami et al., 2020).
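The fusion procedure described above can be sketched as follows. This is a simplified illustration, not the published SNF implementation: it assumes a plain Gaussian affinity with a fixed `sigma` instead of the locally scaled kernel, and it omits the per-iteration normalization and symmetrization steps. All function names and the toy data are hypothetical.

```python
import math

def affinity(data, sigma=1.0):
    """Gaussian affinity between patients (rows of `data`) in one omics view."""
    n = len(data)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(data[i], data[j]))
                      / (2 * sigma ** 2)) for j in range(n)] for i in range(n)]

def full_kernel(w):
    """Global kernel P: row-normalized affinities with 1/2 on the diagonal."""
    n = len(w)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        off_diag = sum(w[i][k] for k in range(n) if k != i)
        for j in range(n):
            p[i][j] = 0.5 if i == j else w[i][j] / (2 * off_diag)
    return p

def local_kernel(w, k):
    """Local kernel S: keep only each patient's k most similar neighbours."""
    n = len(w)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = sorted((j for j in range(n) if j != i),
                      key=lambda j: -w[i][j])[:k]
        total = sum(w[i][j] for j in nbrs)
        for j in nbrs:
            s[i][j] = w[i][j] / total
    return s

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def fuse(views, k=2, n_iter=10, sigma=1.0):
    """Fuse per-view patient networks by iterative cross-view diffusion."""
    ws = [affinity(v, sigma) for v in views]
    ps = [full_kernel(w) for w in ws]
    ss = [local_kernel(w, k) for w in ws]
    n, m = len(ps[0]), len(views)
    for _ in range(n_iter):
        updated = []
        for v in range(m):
            others = [ps[u] for u in range(m) if u != v]
            avg = [[sum(p[i][j] for p in others) / len(others)
                    for j in range(n)] for i in range(n)]
            # Diffuse the other views' global kernel through this view's
            # local neighbourhood structure: S_v * avg * S_v^T.
            updated.append(matmul(matmul(ss[v], avg), transpose(ss[v])))
        ps = updated
    # The fused network is the average of the converged per-view kernels.
    return [[sum(p[i][j] for p in ps) / m for j in range(n)] for i in range(n)]

# Toy data: four patients, two views, two obvious groups (0, 1) and (2, 3).
view_a = [[0.0], [0.1], [5.0], [5.1]]
view_b = [[1.0, 1.0], [1.1, 0.9], [4.0, 4.2], [4.1, 4.0]]
fused = fuse([view_a, view_b], k=1)
```

In this toy case the fused matrix keeps strong within-group similarity and near-zero between-group similarity, so a downstream clustering step such as spectral clustering would separate the two groups.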
NEMO is a multi-omics clustering method that can be used for partial datasets without the need for data imputation (Rappoport and Shamir, 2019). NEMO first calculates an inter-patient similarity matrix for each omics dataset and then combines the matrices of different omics datasets into a single matrix. Clusters are identified using an adjusted Rand index to compute the similarity between patients by distance. NEMO was shown to outperform other multi-omics clustering algorithms when tested on multi-omics datasets of 10 cancers, and exhibited enhanced cluster detection from partial datasets.
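One way to picture how partial datasets can be combined without imputation is to average each pair's similarity over only those omics in which both patients were measured. The sketch below illustrates that idea under simplifying assumptions: it uses a plain Gaussian similarity with a fixed `sigma` rather than the neighbourhood-based relative similarity of the published method, and the function names and toy profiles are hypothetical.

```python
import math

def pairwise_similarity(profiles, sigma=1.0):
    """Gaussian similarity for every ordered pair of patients in one omic.

    `profiles` maps a patient ID to a feature vector; patients who were not
    profiled in this omic are simply absent from the dictionary.
    """
    sims = {}
    for a, xa in profiles.items():
        for b, xb in profiles.items():
            if a != b:
                d2 = sum((u - v) ** 2 for u, v in zip(xa, xb))
                sims[(a, b)] = math.exp(-d2 / (2 * sigma ** 2))
    return sims

def combine_partial(omics, sigma=1.0):
    """Average similarities over the omics in which both patients appear."""
    per_omic = [pairwise_similarity(p, sigma) for p in omics]
    pairs = set().union(*per_omic)  # every pair observed in at least one omic
    combined = {}
    for pair in pairs:
        values = [sims[pair] for sims in per_omic if pair in sims]
        combined[pair] = sum(values) / len(values)
    return combined

# Toy example: patient p3 has mRNA data but no methylation data.
mrna = {"p1": [0.0], "p2": [0.0], "p3": [2.0]}
methylation = {"p1": [0.0], "p2": [1.0]}
combined = combine_partial([mrna, methylation])
```

Here the similarity between `p1` and `p2` is the mean of two omics, whereas the similarity between `p1` and `p3` comes from mRNA alone, so no values need to be imputed for the missing methylation profile.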
MONET is a method for detecting similar modules commonly present across multi-omics datasets (Rappoport et al., 2020). MONET utilizes three omics datasets (mRNA expression, DNA methylation, and miRNA expression) to compute an edge-weighted graph per omics dataset, where nodes represent samples and edges represent the similarity between samples. It then detects a disjoint set of modules for patients from multiple omics graphs. MONET was used to conduct benchmarking on 287 patients with ovarian serous cystadenocarcinoma, and revealed four sample modules representing venous invasion status and survival rates.
PARADIGM is a method to identify specific biological pathways from a multi-omics dataset (Vaske et al., 2010). It combines multi-omics-scale values derived from an individual sample with gene activities, products, and an overview of the pathway interactions included in the National Cancer Institute (NCI) database, which contains information on protein-protein interactions. PARADIGM utilizes factor graphs derived from variables representing the state of various entities (e.g., a specific mRNA molecule or protein complex), and then creates probabilistic graphical models. Using these, it infers significant and non-significant interactions between pathways involving different entities. This tool proved to be efficient, and revealed four subtypes of glioblastoma with significantly different survival outcomes according to the perturbed pathways. This result suggests that the cancer subtype could be used as a basis to support clinical decisions.
LRAcluster is a multi-omics approach that integrates data on somatic mutations, CNVs, DNA methylation, and gene expression, and performs low-rank approximation from the probabilistic models of various molecular features (Wu et al., 2015). All molecular features from the omics datasets are transformed into variables and arranged in a parameter matrix, which is subject to the low-rank assumption. Next, dimension reduction is conducted, revealing clusters associated with distinct clinical subtypes. LRAcluster outperformed other existing methods in terms of both time and classification accuracy when tested on multi-omics datasets of breast invasive carcinoma, colon adenocarcinoma, and lung adenocarcinoma (LUAD).
BCC is a data-driven approach that performs consensus clustering across multi-omics datasets (Lock and Dunson, 2013). BCC is based on the finite Dirichlet mixture model to explain not only overall consensus clustering, but also important features inherent to an individual omics dataset. Given that clusters constructed using a single data type are roughly connected, BCC seeks an integrative point for their adherence to an overall cluster. BCC was applied to 384 breast cancer patients from TCGA datasets, including gene expression, DNA methylation, and protein data, and effectively revealed three cancer subtypes associated with specific clinical features.
Cancer research has taken advantage of advances in omics technologies from genomics to transcriptomics and of the wide range of resources of multiple omics datasets originating from the same patients. Multi-omics approaches provide a unique opportunity to identify the molecular and clinical features of cancer patients. In genomics and transcriptomics, there is an unmet need to disentangle incompatibility in related biological processes, such as differences in post-translational modifications or variability in expression profiles due to the role of mRNA transcripts in cancer development (Greenbaum et al., 2003; Hegde et al., 2003; Tyers and Mann, 2003). Recent advances in proteomics through the maturation of several mass spectrometry techniques have enabled the introduction of proteogenomic approaches, which can integrate genomic data with proteomics and information on post-translational modifications (e.g., protein phosphorylation and acetylation). Large-scale proteogenomic research, including that promoted by the CPTAC (Gillette et al., 2020; Krug et al., 2020; Mertins et al., 2016; Mun et al., 2019; Zhang et al., 2016), has been conducted to unravel new biological mechanisms in cancers and provide fundamental information on multi-omics approaches for the development of integration strategies or computational algorithms.
Multi-omics clustering further refined the association between molecular profiles and clinical features among cancer patients (Fig. 2). The identification of coherent subtypes across multiple dataset layers could have major implications for predicting clinical relevance or therapeutic response regardless of the overall tumor mutational load. Moreover, the integration of proteomics datasets enables the identification of a direct connection between mutations and phenotypes, and therefore increases the resolution of clustering patterns across samples. Here, we summarize the latest findings obtained in cancer research using multi-omics approaches.
Despite extensive research on its mutation signature and gene expression landscape, LUAD shows a high level of intrinsic or acquired resistance after treatment. Therefore, recent multi-omics-based efforts have been made to integrate genomic, transcriptomic, and proteomic datasets and decipher the molecular features underlying durable treatment responses.
Recently, the CPTAC has conducted a large-scale multi-omics study of LUAD by integrating WES, WGS, RNA-seq, miRNA and DNA methylation profiling, and high-resolution mass spectrometry-based proteomics, phosphoproteomics, and acetylproteomics. Integrative multi-omics clustering revealed four clusters of clinical and molecular features. For example, the patients in Cluster 1 were mostly
In another large-scale study, Chen et al. (2020) applied multi-omics approaches to early-stage, non-smoker patients in Taiwan using WES, RNA-seq, and proteomics datasets. Clustering was performed separately for proteomics, transcriptomics, and phosphoproteomics datasets, and clustering of proteomics data into three subtypes was chosen as the best representative of tumor staging and driver mutation classification. The largest group, Subtype 1, was composed of late-stage tumors (> II) with a high mutation rate, including in
Multi-omics analyses have increased our knowledge of breast cancer biology. In particular, integrative analyses have revealed the recurrence of mutations in the
An integrative analysis of gene expression and proteomics has been applied to the survival data of
A recent study on 122 patients integrating data on mutations, mRNA expression, protein expression, and post-translational modifications (phosphorylation and acetylation) has yielded robust profiles to elucidate the biological features of breast cancer (Krug et al., 2020). The resulting subtypes, that is, the basal-inclusive, HER2-inclusive, LumA-inclusive, and LumB-inclusive subtypes, were similar to those generated by the already existing and widely used PAM50 assay but revealed hidden biological structures such as the status of the
Multi-omics research on gastric cancers revealed four subtypes: 1) an Epstein–Barr virus subtype with recurrent
In highly characterized samples of glioblastoma patients, a multi-omics approach has delineated core transcriptional factors (CEBP and STAT3) that widely regulate mesenchymal transformation in glioblastoma (Carro et al., 2010). Integrative analyses of gene expression and phosphoproteomes have identified several cellular features that respond to stress and growth factors (Hill et al., 2012; Huang et al., 2013), are key regulators of the EGFR signaling pathway, and are associated with patient survival outcomes (Amit et al., 2007). Similarly, combining proteomic and metabolomic profiles also revealed a unique regulatory function in a cellular network of stress and growth factors (Bordbar et al., 2012). Dekker et al. (2020) conducted an integrative multi-omics analysis of gene and protein expression, as well as phosphoproteomic profiles, using paired primary and recurrent tissue samples from eight glioblastoma patients. Half of the patients showed a marked difference in the phosphorylation of STMN1 (S38), a component of the ERBB4 signaling pathway.
Integrating methylation profiles with genomic and transcriptomic datasets can substantiate the utility of studying acute myeloid leukemia (AML). A multi-omics analysis of 200 adult patients with AML showed distinct gene expression and methylation patterns across samples (Cancer Genome Atlas Research Network et al., 2013b). In particular, CpG-sparse regions showed a marked difference in methylation due to gene mutations. AML cells with
A multi-omics approach has also been applied to pancreatic ductal adenocarcinoma (PDAC) by integrating omics profiling of 150 patients for mutations, gene expression (mRNA, miRNA, and long non-coding RNA [lncRNA]), DNA methylation, and protein expression (Cancer Genome Atlas Research Network, 2017).
Drug target discovery is a critical step in the development of cancer drugs and personalized therapeutics. In traditional drug target discovery, biomolecules with a confirmed mechanism of action are selected through a series of studies, which require enormous manpower (Lindsay, 2003; Paananen and Fortino, 2020). Over the last decade, putative drug targets have been identified through the latest high-throughput genomic approaches in combination with experimental validation, including overexpression or knockdown by RNAi and the use of transgenic animals and model organisms (Benson et al., 2006). Multi-omics is an interdisciplinary approach to study biological characteristics, and can comprehensively yield many drug target candidates in a cost-effective manner. The analysis of 14 cancer subtypes from TCGA multi-omics datasets revealed 40 driver genes associated with the Wnt, Notch, Hedgehog, JAK/STAT, NF-κB, and MAPK signaling pathways (Chen et al., 2014). Among them, well-known driver genes such as
Multi-omics approaches may allow systematic assessment of drug discovery for personalized cancer therapy and improve the efficacy of chemotherapy (Aguirre et al., 2018; Li et al., 2013; Pauli et al., 2017). Refining molecular-defined subsets of patients can provide information on drug response and resistance, which vary among patients. Cui et al. (2020) integrated the expression of lncRNA, miRNA, mRNA, methylation, and the profile of somatic mutations with the expression of drug response-related lncRNAs. These authors found that lncRNAs respond to diverse chemotherapeutic drugs and characterized some key lncRNAs, such as HOXA-AS2, which mediate resistance to the drug adriamycin in BRCA patients (Cui et al., 2020). Another proteogenomic study of breast cancer found that triple-negative BRCA (TNBC) tumors with
In this review, we introduce computational methods for multi-omics studies and report the latest findings in cancer research based on them. Multi-omics approaches can fully characterize the intersection between different layers of quantitative information, systematically summarizing biological interactions from an individual cell or tissue to an individual patient with a primary tumor and possible metastases. In addition, such integration can reflect the molecular characteristics of tumors at various levels, from genes to proteins, and different cancer stages through multidisciplinary analysis.
Multi-omics approaches may hold the potential to study different cancer types with a high level of similarity, in terms of molecular characteristics, to basal-like breast cancer, high-grade serous ovarian cancer, and serous endometrial cancer (Cancer Genome Atlas Research Network et al., 2013a). A systems approach integrating multi-omics data is key to understanding cancer biology and investigating the molecular pathogenesis of cancer. Multi-omics data analysis across tumor types can identify molecular characteristics commonly underlying a range of cancer types and further detail patient subgroups as well as the molecular classification of cancer subtypes.
Therefore, multiple data layers, including genomics, transcriptomics, epigenomics, and proteomics datasets, are required to fully represent the molecular and clinical structures of cancer patients. The generation of high-quality and unbiased datasets is a critical part of multi-omics approaches. In addition, further studies should consider proper integration methods and computational algorithms for robust and systematic assessment to obtain solid findings and predictive models.
This work was supported by the Korean NRF Grant 2019M3E5D3073568 (to J.Y.A.) and a Korea University Grant.
Y.J.H. and J.Y.A. wrote the original draft. Y.J.H., C.H., G.H.L., J.M.P., and J.Y.A. reviewed and edited the manuscript. Y.J.H., C.H., and J.Y.A. provided a figure and table.
The authors have no potential conflicts of interest to disclose.
Table 1. List of computational frameworks for multi-omics cancer studies.
Study | Findings | Dataset | Principles |
---|---|---|---|
iCluster (Curtis et al., 2012; Shen et al., 2009) | Novel subgroups from 2,000 breast tumors | mRNA expression, CNV | Joint latent variable model-based clustering method |
iOmicsPASS (Koh et al., 2019) | Novel transcriptional regulatory network from TCGA/CPTAC breast cancer data | mRNA expression, CNV, protein expression | Network construction using a modified nearest shrunken centroid algorithm |
SALMON (Huang et al., 2019) | Improved survival analysis | Mutation, mRNA/miRNA expression, CNV | Deep learning based on co-expression modules |
SNF (Wang et al., 2014) | Subtype classification of clinical relevance | mRNA, DNA methylation | Patient similarity networks using an iterative procedure based on message passing |
NEMO (Rappoport and Shamir, 2019) | Novel subtypes from even partial AML datasets | mRNA, DNA methylation | Sample clustering from partial datasets using an adjusted Rand index |
MONET (Rappoport et al., 2020) | Module detection of patient subtypes and improved survival analysis | mRNA, DNA methylation | Detection of similar modules commonly present across multi-omics datasets |
PARADIGM (Vaske et al., 2010) | Detection of pathways affected by cancer with fewer false positives | mRNA expression, CNV | Pathway recognition algorithm applied to multi-omics datasets |
LRAcluster (Wu et al., 2015) | Subtype detection in both pan-cancer analysis and single cancer types | Mutation, mRNA expression, CNV, DNA methylation | Low-rank approximation from probabilistic models |
BCC (Lock and Dunson, 2013) | Detection of patient subtypes in response to survival rates and driver mutation signatures | mRNA, DNA methylation, protein expression | Bayesian framework for estimation of an integrative clustering model |
a) Gene expression data with normalization (e.g., quantile normalization, fragments per kilobase of transcript per million mapped reads [FPKM]).
b) Quantification of miRNA expression.
c) Circular binary segmentation-based copy number segmented means.
d) Affymetrix 6.0 SNP arrays.
e) Protein quantification by iTRAQ (isobaric tags for relative and absolute quantification).
f) Reverse phase protein array (RPPA).
g) Illumina Human Methylation arrays.
h) In the SALMON method, the copy number burden (CNB) is calculated using the total gene length (Kb) from SNP 6 data, and the tumor mutation burden (TMB) is calculated using the total number of mutated genes reported in Mutation Annotation Format (MAF) files.
i) The LRAcluster method uses somatic mutation data converted into a binary form.