Mol. Cells 2014; 37(6): 457-466
Published online May 8, 2014
https://doi.org/10.14348/molcells.2014.0035
© The Korean Society for Molecular and Cellular Biology
Correspondence: biolab@snu.ac.kr
Proteomic analysis is helpful in identifying cancer-associated proteins that are differentially expressed or fragmented and that can be annotated to dysregulated networks and pathways during metastasis. To examine the metastatic process in lung cancer, we performed a proteomics study by label-free quantitative analysis and N-terminal analysis in 2 human non-small-cell lung cancer (NSCLC) cell lines with disparate metastatic potentials: NCI-H1703 (primary cell, stage I) and NCI-H1755 (metastatic cell, stage IV). We identified 2130 proteins, 1355 of which were common to both cell lines. In the label-free quantitative analysis, we used the NSAF normalization method, resulting in 242 differentially expressed proteins. For the N-terminal proteome analysis, 325 N-terminal peptides, including 45 novel fragments, were identified in the 2 cell lines. Based on the two proteomic analyses, 11 quantitatively expressed proteins and 8 N-terminal peptides were enriched for the focal adhesion pathway. Most proteins from the quantitative analysis were upregulated in metastatic cancer cells, whereas a novel fragment of CRKL was detected only in primary cancer cells. This study increases our understanding of the NSCLC metastasis proteome.
Keywords: label-free quantitative analysis, metastasis, N-terminal analysis, non-small-cell lung cancer
Lung cancer is the leading cause of cancer-related deaths worldwide (30%) but constitutes only 15% of new cancer diagnoses (Parkin and Fernandez, 2006). Despite advances in cancer research, the 5-year survival rate of lung cancer remains low at 16%, compared with 65% for colon cancer, 89% for breast cancer, and 100% for prostate cancer (Jemal et al., 2010). Lung cancer is divided into 2 major histological types: small-cell lung cancer (SCLC) and non-small-cell lung cancer (NSCLC) (Hoffman et al., 2000). SCLC is commonly treated with chemotherapy and radiotherapy, and NSCLC is usually treated with surgery. Yet, surgery for NSCLC is effective only in those who are diagnosed at an early stage. More than 70% of NSCLC patients are diagnosed at a late stage with metastasis, resulting in a loss of opportunity for effective surgery and, ultimately, a poor prognosis (Tan et al., 2012).
Metastasis is a major cause of death from lung cancer and involves several processes, including the detachment of cancer cells, invasion of cancer cells into the surrounding tissue, and colonization of and proliferation in distant organs (Hwang et al., 2012; Tian et al., 2007). During metastasis, irreversible protein fragmentation occurs (Lopez-Otin and Bond, 2008). Dysregulation of protein fragmentation reactions in organs can cause pathological developmental disorders, such as cancer, inflammation, infection, and Alzheimer disease (Dawson and Dawson, 2003; Opferman and Korsmeyer, 2003; Rao, 2003).
In lung cancer, serum cytokeratin 19 fragments (CYFRA 21-1) are generated by a protein fragmentation reaction and have recently been implicated as a biomarker for the diagnosis and prognosis of NSCLC (Nisman et al., 2008). Pro1708/Pro2044 (the C-terminal fragment of albumin) (Kawakami et al., 2005) and HER2 rb2 (the ectodomain of human epidermal growth factor receptor-2) (Streckfus et al., 1999) are also cancer biomarkers that are generated by protein fragmentation. The identification of natural protease substrates and their cleavage sites provides essential information for understanding the regulation of metastatic pathways. Thus, the pathways that culminate in protein fragmentation events must be examined to develop novel and more effective molecular markers and therapeutic targets.
Proteomic analysis for global protein identification is a powerful tool that can be used to identify novel biomarkers in various diseases. Of such methods, label-free quantification determines the expression levels of nontarget proteins (Fanayan et al., 2013). Many global quantitative proteomics studies have examined metastasis in various cancers, such as colorectal cancer (Xue et al., 2010), breast cancer (Xie et al., 2010), and hepatocellular carcinoma (Wang et al., 2011). However, there are few reports on the proteomic profile in metastatic lung cancer. For instance, Tian et al. identified metastasis-related proteins in NSCLC cell lines (nonmetastatic CL1-0 and the highly metastatic CL1-5) by 2-DE analysis (Tian et al., 2007).
The recent development of N-terminal peptide analysis, based on mass spectrometry, has enabled us to generate data on protein targets and fragmentation sites (Brown and Hartley, 1966). To this end, several groups have established a method of identifying protease-generated (neo) peptides in cellular pathways, known as N-terminomics (Enoksson et al., 2007). Combined fractional diagonal chromatography (COFRADIC) is a pioneering technique in N-terminomics. Free amines of proteins are first acetylated prior to trypsin digestion and RP-HPLC fractionation. The N-termini of neo peptides are then derivatized with a hydrophobic reagent to allow the original N-terminal peptides to be purified on rechromatography (Gevaert et al., 2003). However, the COFRADIC method requires many HPLC and LC-MS/MS runs and large amounts of starting material to select N-terminal neo peptides. McDonald and Beynon (2006) developed a more rapid and simpler N-terminal peptide analysis method (positional proteomics) that is based on negative selection by chemical labeling of the α-amine in proteins.
In this study, to differentiate primary cancer cells from metastatic cells, we performed 2 parallel experiments: label-free quantification and N-terminal peptide analysis (positional proteomics methods) by LC-MS/MS. Two human non-small-cell lung cancer cell lines were used: NCI-H1703, a stage I primary cancer cell, and NCI-H1755, a stage IV metastatic cancer line (Anisowicz et al., 2008). Our label-free quantification identified 2130 proteins from the LC-MS/MS analysis, 242 of which were differentially expressed between NCI-H1703 and NCI-H1755 cells. Analysis of N-terminal neo peptides identified 325 N-terminal peptides, 45 of which were observed in both cell lines. This differential expression of the proteome and N-terminal neo peptides can increase our understanding of differentially regulated pathways between primary and metastatic cancer cells in human non-small-cell lung cancer.
HPLC-grade water, HPLC-grade acetonitrile (ACN), and HPLC-grade methanol (MeOH) were obtained from FISHER (USA). Hydrochloric acid (HCl) and sodium chloride (NaCl) were purchased from DUKSAN (Korea). Urea and dithiothreitol (DTT) were purchased from AMRESCO (USA). Phenylmethanesulfonyl fluoride (PMSF), sodium dodecyl sulfate (SDS), and Tris were obtained from USB (USA). Complete protease inhibitor cocktail tablets were acquired from ROCHE (USA), and sequencing-grade modified trypsin was purchased from PROMEGA (USA). Sulfo-NHS acetate and NHS-activated agarose slurry were obtained from Pierce (USA). All other reagents (iodoacetamide, α-cyano-4-hydroxycinnamic acid (CHCA), and trifluoroacetic acid (TFA)) were purchased from Sigma-Aldrich (USA).
Stage 1 (NCI-H1703) and stage 4 non-small-cell lung cancer cells (NCI-H1755) were obtained from the Korean Cell Line Bank. Both lines were cultured in RPMI1640 (WelGENE, Korea) with 10% fetal bovine serum (Gibco, USA), 100 U/ml penicillin and 100 μg/ml streptomycin (Gibco, USA) and 25 mM HEPES (Gibco, USA). The cultures were maintained in 95% humidified air and 5% CO2 at 37°C.
To prepare the cell lysates, cells were grown to 80% confluence and lysed in a strong SDS-based buffer containing 4% SDS, 0.1 mM PMSF, 1× protease inhibitor cocktail, 0.1 M DTT, and 0.1 M HEPES. Lysates were incubated at 95°C for 5 min and sonicated for 1 min. Supernatants were collected from the lysates by centrifugation at 15,000 × g for 20 min at 4°C. Protein concentrations were measured using the BCA Protein Assay Kit, reducing reagent-compatible (Pierce, USA). Finally, each cell lysate was stored in 0.2-mg aliquots at -80°C until use.
Cell lysates were processed by filter-aided sample preparation (FASP) (Wisniewski et al., 2009) using a 10 K molecular weight cutoff (MWCO) filter (Millipore, USA). Briefly, 200 μg of cell lysate in lysis buffer (4% SDS, 0.1 mM PMSF, 1× protease inhibitor cocktail, 0.1 M DTT, and 0.1 M HEPES) was transferred to the filter and mixed with 0.2 ml 8 M urea in 0.1 M HEPES, pH 7.5 (FASP solution). Samples were centrifuged at 14,000 × g at 20°C for 20 min. The samples in the filter were diluted with 0.2 ml FASP solution and centrifuged again. The reduced cysteines were alkylated with 0.1 ml 50 mM iodoacetamide in FASP solution; the samples were incubated at room temperature (RT) in the dark for 30 min and centrifuged for 20 min.
For the label-free quantification, alkylated samples were mixed with 0.2 ml 50 mM Tris solution and centrifuged at 14,000 × g at 20°C for 20 min; this step was repeated 3 times. One hundred microliters 50 mM Tris solution with trypsin (enzyme: protein ratio 1:80) was added to the resulting concentrate and incubated for 16 h at 37°C. Peptides were collected from the filter by centrifugation for 20 min to new collection tubes and acidified with 2% TFA.
For the N-terminal peptide analysis, alkylated samples were mixed with 0.1 ml 50 mM HEPES with Sulfo-NHS acetate (Sulfo-NHS acetate:protein ratio of 25:1) and incubated for 2 h at RT. The samples were centrifuged at 14,000 × g at 20°C for 20 min, mixed with 0.2 ml 1 M Tris solution, and incubated on the filter for 4 h at RT. The samples were then centrifuged 4 times at 14,000 × g at 20°C for 20 min. One hundred microliters 50 mM Tris solution with trypsin (enzyme:protein ratio of 1:80) was added to the filter and incubated for 16 h at 37°C. Digested peptides were collected by centrifugation and acidified with 2% TFA.
Digested samples were desalted using in-house C18 StageTip desalting (STD) columns, as described (Han et al., 2012). Briefly, in-house C18 STD columns were prepared by reversed-phase packing of POROS 20 R2 material into 0.2-ml yellow pipet tips that sat atop C8 Empore disk membranes. The STD columns were washed with 0.1 ml 100% methanol and with 0.1 ml 100% ACN 3 times and equilibrated 3 times with 0.1 ml 0.1% TFA. After the peptides were loaded, the STD columns were washed 3 times with 0.1 ml 0.1% TFA, and the peptides were eluted with 0.1 ml of a series of elution buffers, containing 0.1% TFA and 40, 60, and 80% ACN. All eluates were combined and dried in a vacuum centrifuge.
Dried samples were dissolved in BupH™ PBS (Pierce, USA). One milliliter of an NHS-agarose bead slurry (50% slurry in acetone) was prepared per the manufacturer’s protocol (Pierce, USA). Briefly, acetone was removed from the slurry by centrifugation, and the slurry was washed 2 times with water and equilibrated 3 times with BupH™ PBS. After mixing with the equilibrated beads, the labeled samples were incubated for 4 h at RT. Finally, the beads were centrifuged at 1,000 × g for 30 s, and the supernatant was transferred to new tubes, acidified with 2% TFA, and desalted again.
Bovine serum albumin (BSA) peptides (Amresco, USA) were N-terminally labeled as described above as a control. The peptides were dissolved in 10 μl 0.1% TFA, and 0.5 μl of each sample was mixed with 0.5 μl of a matrix solution that contained 5 mg/ml CHCA (Sigma, USA), 70% ACN, and 0.1% TFA. The peptides were spotted directly onto a MALDI plate (Opti-TOF™ 384-well Insert, Applied Biosystems, USA) and crystallized with the matrix. Dried peptides were analyzed on a 4800 MALDI-TOF/TOF™ Analyzer (Applied Biosystems) that was equipped with a 355-nm Nd:YAG laser. The pressure in the TOF analyzer was approximately 7.6 × 10⁻⁷ Torr.
The mass spectra were obtained in the reflectron mode over an m/z range of 800-3500 with an accelerating voltage of 20 kV. External calibration was performed using des-Arg-Bradykinin (904.468 Da), angiotensin 1 (1,296.685 Da), Glu-Fibrinopeptide B (1,570.677 Da), adrenocorticotropic hormone (ACTH) (1-17) (2,093.087 Da), and ACTH (18-39) (2,465.199 Da) (4700 calibration mixture, Applied Biosystems). Raw data were reported by 4000 SERIES EXPLORER, v4.4 (Applied Biosystems).
All peptide samples were analyzed on an LTQ-Orbitrap Velos mass spectrometer (Thermo Scientific, USA) that was coupled to an EasyLC II (Proxeon Biosystems, Denmark), equipped with a nanoelectrospray device and fitted with a 10-μm fused silica emitter tip (New Objective, USA). Ten microliters of each sample was loaded onto a nano-LC trap column (ZORBAX 300SB-C18, 5 μm, 0.3 × 5 mm, Agilent, USA), and peptides were separated on a C18 analytical column (75 μm × 15 cm) that was packed in-house with C18 resin (Magic C18-AQ 200 Å, 5-μm particles). Solvent A was 98% water with 0.1% formic acid and 2% ACN, and Solvent B was 98% ACN with 0.1% formic acid and 2% water.
Peptides were separated using a 180-min gradient at 300 nl/min, comprising 0% to 40% B for 120 min, 40% to 60% B for 20 min, 60% to 90% B for 10 min, 90% B for 10 min, 90% to 5% B for 10 min, and 0% B for 10 min. The spray voltage was set to 1.8 kV, and the temperature of the heated capillary was 200°C. The mass spectrometer scanned a mass range of 300 to 2000. The data on the top 10 most abundant ions were analyzed in data-dependent scan mode over a minimum threshold of 1000. The normalized collision energy was adjusted to 35%, and the dynamic exclusion was set to a repeat count of 1, repeat duration of 30 s, exclusion duration of 60 s, and ± 1.5 m/z exclusion mass width. Each biological replicate was analyzed in triplicate.
After data acquisition, data searches were performed using SEQUEST Sorcerer (Sage-N Research, USA). Raw files from the LTQ-Orbitrap Velos were converted into mzXML files using the Trans-Proteomic Pipeline (TPP, ISB, USA). MS/MS data were searched using a target-decoy database strategy against a composite database that contained the International Protein Index (IPI) human database (v3.87, 91,464 entries) and its reverse sequences, which were generated using Scaffold 3 (Proteome Software Inc., USA).
For the label-free quantification dataset and the N-terminal peptide dataset, 2 independent sets of search parameters were used. Parameters for the label-free quantification dataset were as follows: enzyme, full-trypsin; peptide tolerance, 10 ppm; MS/MS tolerance, 1.0 Da; variable modifications, oxidation (Met); and static modifications, carbamidomethylation (Cys). Identified proteins were filtered using Scaffold 3, based on a minimum of 2 unique peptides and a false discovery rate (FDR) < 1%. The parameters for the N-terminal peptide dataset were as follows: enzyme, semi-arginine; peptide tolerance, 10 ppm; MS/MS tolerance, 1.0 Da; variable modifications, oxidation (Met); and static modifications, carbamidomethylation (Cys) and acetylation (N-term and Lys). Peptide-spectrum matches were filtered to an FDR of less than 1% using the statistical tools in TPP.
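The FDR filter in a target-decoy search can be illustrated with a minimal sketch. It assumes the common estimate in which the FDR at a score threshold is the number of decoy (reverse-sequence) matches divided by the number of target matches; the exact estimators used by Scaffold 3 and TPP are not stated here and may differ. The function names and the example scores are hypothetical.

```python
def decoy_fdr(psms, threshold):
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    psms: list of (score, is_decoy) pairs from a target-decoy search.
    Assumption: FDR is approximated as decoy hits / target hits.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_below(psms, max_fdr=0.01):
    """Return the lowest score threshold whose estimated FDR is below `max_fdr`."""
    for threshold in sorted({score for score, _ in psms}):
        if decoy_fdr(psms, threshold) < max_fdr:
            return threshold
    return None


# Hypothetical PSM list: low-scoring matches contain many decoys,
# high-scoring matches contain few.
example_psms = (
    [(1.0, False)] * 50 + [(1.0, True)] * 20
    + [(3.0, False)] * 200 + [(3.0, True)] * 1
)
chosen_threshold = lowest_threshold_below(example_psms, max_fdr=0.01)
```

With these invented scores, accepting every match gives an estimated FDR of 21/250, whereas restricting to the higher score brings the estimate under the 1% cutoff.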
The label-free quantitative analysis of peptides was performed by spectral counting analysis. To calculate a protein spectrum count, we exported the numbers of peptides that were assigned to each protein from Scaffold 3. Exported data were analyzed by normalized spectral abundance factor (NSAF) method to normalize run-to-run variations (Zybailov et al., 2006). NSAF values were calculated as:
$$\mathrm{NSAF}_i = \frac{\mathrm{SpC}_i / \mathrm{Mw}_i}{\sum_{k=1}^{n} \left( \mathrm{SpC}_k / \mathrm{Mw}_k \right)}$$
where SpC is the spectral count, Mw is the molecular weight in kDa, and n is the total number of proteins. Because expression ratios calculated from spectral counts of 0 are undefined (they appear as ‘#DIV/0!’ in Microsoft Office Excel 2010), we shifted all spectral counts equally by adding 0.1 to the original values. With the NSAF method, we could compare expression levels and apply independent 2-sample
Data were analyzed using various bioinformatics tools. To determine N-terminal peptide sites, we performed manual annotations using UniProtKB (Universal Protein Resource Knowledgebase) (
The biological process and molecular function classifications of identified proteins were analyzed using PANTHER ID numbers (
To differentiate the proteomic changes between primary and metastatic cells, whole-cell lysates of cultured human non-small-cell lung cancer cell lines (NCI-H1703 and NCI-H1755) were analyzed in parallel experiments, as depicted in Fig. 1. Each cell line was cultured as 3 independent biological replicates and prepared by FASP.
For the label-free quantitative proteomic analysis, cell lysates were digested with trypsin and desalted with a C18 in-house stage tip prior to LTQ-Orbitrap Velos analysis. To ensure the reliability of the quantitative profiling, each sample was injected in triplicate (3 technical replicates) for each biological replicate. A total of 18 raw files from the LTQ-Orbitrap Velos were processed in Scaffold 3 with the SEQUEST algorithm.
To analyze the N-terminal peptide data, free amines in the cell lysates were labeled with NHS-acetate. The remaining NHS-acetate was quenched by the amine group of Tris. N-terminally labeled proteins were digested with trypsin, desalted using C18 in-house stage tips, and filtered with NHS-activated beads, which depleted the peptides whose N-termini were newly generated by trypsin. The supernatant was desalted using C18 in-house stage tips again. To profile the N-terminal peptides, the samples were analyzed in triplicate (3 technical replicates) for each biological replicate. A total of 18 raw data files were then processed in SEQUEST and TPP. All data from the whole-cell lysates and N-terminal peptides were classified using informatics tools.
Samples were prepared by FASP, and LC-MS/MS analysis was performed using the LTQ-Orbitrap Velos. MS/MS data were acquired for the biological and technical triplicates for each cell line and processed to identify peptides that generated the observed spectra, and proteins were inferred, based on the identified peptides. Because the MS/MS spectral counts for peptides from shotgun proteomic approaches have recently been shown to estimate protein abundance well, we performed a label-free quantitative analysis of NSCLC cell lines, based on a shotgun proteomics strategy and spectral counting techniques.
A total of 18 raw files from the 2 cell lines were combined into a single merged output file in Scaffold 3, in which the analysis was restricted to proteins with at least 2 unique peptides and an FDR < 0.5%. Per these criteria, we reproducibly identified 2130 nonredundant proteins (Fig. 2A and Supplementary Table S1), 28% of which were identified by 2 unique peptides, 17% by 3 unique peptides, 11% by 4 unique peptides, and 44% by 5 or more unique peptides (Fig. 2B).
We classified all identified proteins by gene ontology (GO) analysis as biological process and molecular function. Many proteins mapped to the GO terms “protein metabolism and modification” (309 proteins), “intracellular protein traffic” (213 proteins), “protein biosynthesis” (147 proteins), “cell structure and motility” (147 proteins), and “cell cycle in biological process” (95 proteins) (Fig. 2C). Notably, molecular functions were assigned to many proteins: 493 proteins were annotated with the GO term “nucleic acid binding,” 157 proteins were related to “cytoskeletal protein,” 123 proteins fell under “dehydrogenase,” and 85 proteins were “membrane traffic proteins” (Fig. 2D) (Supplementary Table S1).
To quantify the identified proteins by spectral count, we used normalized spectral abundance factors (NSAF), with which the total number of spectra of an identified protein in each LC-MS/MS run correlates well with the abundance of the corresponding protein over a wide linear dynamic range (Zybailov et al., 2006). High-confidence proteins for label-free quantitation were selected with an average spectral count ≥ 5 in 9 datasets (3 technical and 3 biological replicates) in either cell line. Also, missing values from each dataset were exchanged with a value of 0. Of the 2130 identified proteins, 671 satisfied our label-free quantitative protein criteria (Supplementary Table S2).
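The NSAF normalization and the spectral-count criterion can be sketched as follows. This is an illustration under stated assumptions, not the authors' Excel workflow: the function names, protein identifiers, counts, and molecular weights are hypothetical, while the 0.1 pseudocount, the use of molecular weight in kDa, and the mean spectral count ≥ 5 rule follow the description in the text.

```python
def nsaf(spectral_counts, mol_weights_kda, pseudocount=0.1):
    """Return {protein: NSAF} for one LC-MS/MS run.

    Missing proteins count as 0 spectra; every count is shifted by
    `pseudocount` so that later ratios never divide by zero.
    """
    saf = {
        protein: (spectral_counts.get(protein, 0) + pseudocount) / mw
        for protein, mw in mol_weights_kda.items()
    }
    total = sum(saf.values())
    return {protein: value / total for protein, value in saf.items()}


def passes_count_filter(counts_line_a, counts_line_b, min_mean=5):
    """Keep a protein if its mean spectral count over the runs of
    either cell line reaches `min_mean`."""
    mean_a = sum(counts_line_a) / len(counts_line_a)
    mean_b = sum(counts_line_b) / len(counts_line_b)
    return mean_a >= min_mean or mean_b >= min_mean


# Hypothetical run: protB was not detected (spectral count of 0).
example_counts = {"protA": 12, "protB": 0, "protC": 30}
example_mw = {"protA": 50.0, "protB": 25.0, "protC": 100.0}
example_nsaf = nsaf(example_counts, example_mw)
```

By construction the NSAF values of one run sum to 1, which is what makes runs with different total spectral counts comparable.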
The distribution of the ratio correlation between NCI-H1703 and NCI-H1755 in the 3 biological replicates was selectively plotted, as shown in Supplementary Fig. S1A, in which the 3 distributions had high similarity. To determine the fold-change in expression for each protein between the 2 cell lines, the standard deviations of the 671 quantitative proteins were calculated for the 3 biological replicates, indicating that approximately 90% fell within 0.5 standard deviation (Supplementary Fig. S1B) (Kim et al., 2012). The differential expression ratios for the 671 protein groups are shown in Supplementary Fig. S1C, in which ratios ≥ 1.5-fold are shadowed. The expression of 242 proteins changed ≥ 1.5-fold between NCI-H1703 and NCI-H1755 cells; 92 proteins were upregulated, and 150 proteins were downregulated. For example, integrin alpha-2 (ITGA2), aldehyde dehydrogenase, mitochondrial (ALDH2), UDP-glucose 4-epimerase (GALE), and aldose reductase (AKR1B1) were preferentially expressed in NCI-H1755 cells. Conversely, alpha-internexin (INA), isoform 1 of myosin-10 (MYH10), isoform 3 of UDP-N-acetylhexosamine pyrophosphorylase (UAP1), and isoform 1 of protein AHNAK2 (AHNAK2) were significantly downregulated in NCI-H1755 cells (Table 1 and Supplementary Table S3).
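The 1.5-fold criterion can be expressed as a short sketch. It assumes that the expression ratio is the mean NSAF in NCI-H1755 divided by the mean NSAF in NCI-H1703 and that downregulation corresponds to the reciprocal cutoff (ratio ≤ 1/1.5); both are assumptions of the example, as are the function names and NSAF values.

```python
from statistics import mean


def fold_change(nsaf_metastatic, nsaf_primary):
    """Ratio of mean NSAF values (metastatic line over primary line)."""
    return mean(nsaf_metastatic) / mean(nsaf_primary)


def classify(ratio, cutoff=1.5):
    """Label a protein by its expression ratio.

    Assumption: 'down' uses the reciprocal cutoff, so a 1.5-fold
    decrease is treated symmetrically to a 1.5-fold increase.
    """
    if ratio >= cutoff:
        return "up"
    if ratio <= 1 / cutoff:
        return "down"
    return "unchanged"


# Hypothetical NSAF values for one protein across 3 biological replicates.
example_ratio = fold_change([0.0017, 0.0018, 0.0016], [0.0010, 0.0011, 0.0009])
example_label = classify(example_ratio)
```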
The scheme with which N-terminal peptides were identified is shown in Fig. 3. The N-termini of proteins are characterized by an α-amine, as opposed to the ε-amines on lysine side chains. To select N-terminal peptides through their α-amines, the ε-amines on lysine side chains also had to be blocked. We therefore blocked both the α-amine and ε-amine groups by acetylation using NHS-acetate. In a quenching step, the unreacted NHS-acetate was depleted by the amine in Tris. Next, proteins were digested with trypsin, generating internal peptides with free α-amino groups. Then, we added NHS-activated beads, which bind the free amines of the N-termini newly generated by trypsin, whereas the natural N-terminal peptides, blocked by acetylation, remain unbound (McDonald and Beynon, 2006).
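The negative selection logic can be mimicked in silico with a small sketch. It assumes that trypsin cleaves an acetylated protein only after arginine (acetylated lysine resists cleavage) and not before proline; the proline rule and the function names are assumptions of the example. The toy sequence begins with the BSA N-terminal peptide DTHKSEIAHR, and the residues after it are hypothetical.

```python
def arg_c_like_digest(sequence):
    """Cleave after R (but not before P), as trypsin would on a
    protein whose lysine side chains are acetylated."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue == "R" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def negative_selection(sequence):
    """Split digestion products into the acetyl-blocked original
    N-terminal peptide (left in the supernatant) and the internal
    peptides with free alpha-amines (captured by the beads)."""
    peptides = arg_c_like_digest(sequence)
    return peptides[0], peptides[1:]


# Toy protein: known N-terminal peptide followed by a hypothetical tail.
toy_protein = "DTHKSEIAHR" + "FKDLGEEHFR" + "GLVK"
kept_peptide, captured_peptides = negative_selection(toy_protein)
```

Only the first peptide survives the bead step in this model, which is why the method reports one N-terminal peptide per protein species or fragment.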
In a control experiment, we examined whether this scheme could identify the natural N-termini of bovine serum albumin (BSA). Precursor BSA comprises 607 amino acids, whereas the mature form of BSA contains 583 amino acids, lacking residues 1-24 (Weijers, 1977). Thus, our BSA had an aspartic acid at residue 25 as its natural N-terminus.
Acetylated BSA was digested with trypsin and analyzed by MALDI-MS (Supplementary Fig. S2A). The observed peptide masses were consistent with the expected Arg-C-specific digestion of BSA (acetylated lysine is resistant to tryptic cleavage) and included the known N-terminal peptide (Ac-DTHK(ac)SEIAHR) at m/z 1277.6. As expected, a range of lysine-containing peptides appeared, increasing by 42.03 Da per lysine. After the NHS-activated beads removed the BSA peptides that were newly generated by tryptic digestion, we detected a single major peak at m/z 1277.6 by mass spectrometry. The N-terminal peptide of BSA gave 1 peak that was mass-shifted by acetylation of its α-amine and ε-amine, and its identity was confirmed by MS/MS analysis (Supplementary Fig. S2B).
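The m/z value of the doubly acetylated N-terminal peptide can be checked with a short calculation. The sketch uses standard monoisotopic residue masses and a monoisotopic acetyl increment of 42.011 Da (C2H2O); these constants and the function name are assumptions of the example rather than values taken from the text, which reports the per-lysine shift as 42.03 Da.

```python
# Monoisotopic residue masses (Da) for the residues in DTHKSEIAHR.
RESIDUE_MASS = {
    "D": 115.02694, "T": 101.04768, "H": 137.05891, "K": 128.09496,
    "S": 87.03203, "E": 129.04259, "I": 113.08406, "A": 71.03711,
    "R": 156.10111,
}
WATER = 18.01056    # added once per peptide (free termini)
PROTON = 1.00728    # singly charged ion, [M+H]+
ACETYL = 42.01057   # mass added per acetylated amine (C2H2O)


def singly_charged_mz(peptide, n_acetyl=0):
    """Monoisotopic m/z of the [M+H]+ ion carrying `n_acetyl` acetyl groups."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + n_acetyl * ACETYL
    return neutral + PROTON


# One acetyl on the alpha-amine and one on the lysine epsilon-amine.
bsa_nterm_mz = singly_charged_mz("DTHKSEIAHR", n_acetyl=2)
unlabeled_mz = singly_charged_mz("DTHKSEIAHR")
```

Under these assumptions the labeled peptide comes out close to the observed peak at m/z 1277.6, about 84 Da above the unlabeled form.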
N-terminal peptides were identified in the 2 cell lines by positional proteomics analysis, as described (McDonald and Beynon, 2006). All samples were analyzed with 3 biological and technical replicates, and 307 unique proteins (272 peptides from 261 proteins in NCI-H1703 and 233 peptides from 220 proteins in NCI-H1755) were identified with more than 2 hits in the biological replicate analysis, with > 95% peptide probability and FDR < 1%. Ultimately, 92 unique N-terminal peptides were identified in NCI-H1703 cells compared to 53 in the NCI-H1755 cells (Supplementary Figs. S3A and S3B; Supplementary Table S4).
We analyzed the biological process and molecular function of the identified proteins. With regard to biological process, many proteins were enriched for the GO terms “protein metabolism and modification,” “protein biosynthesis,” and “mRNA splicing.” Many proteins mapped to the molecular function GO terms “nucleic acid binding” (62 proteins), “ribosomal protein” (30 proteins), and “chaperone in molecular function” (18 proteins) (Supplementary Figs. S3C and S3D).
The identified N-terminal peptides were divided into natural N-termini and novel N-terminal neo peptides. Most proteins undergo systematic depletion of their natural N-termini to function. For example, certain proteins have their signal peptides excised from the N-terminus to be secreted. Thus, natural N-termini were grouped into 5 types, based on the molecule processing part of each protein sequence annotation in UniProtKB: initial methionine depletion, initial methionine nondepletion, signal peptide depletion, propeptide depletion, and mitochondrial transit peptide depletion. Except for these natural N-termini, the newly identified peptides in the N-terminus analysis were annotated as novel N-terminal neo peptides that have not been assigned in the UniProtKB database.
A total of 325 unique N-terminal peptides were classified into 6 categories with regard to distributions of N-terminal peptides in NCI-H1703 and NCI-H1755 cells (Figs. 4A and 4B): (1) initial methionine depletion, NCI-H1703 (169 peptides, 62.1%) and NCI-H1755 (148 peptides, 63.5%); (2) initial methionine nondepletion, NCI-H1703 (37 peptides, 13.6%) and NCI-H1755 (28 peptides, 12.1%); (3) signal peptide depletion, NCI-H1703 (15 peptides, 5.5%) and NCI-H1755 (10 peptides, 4.3%); (4) propeptide depletion, NCI-H1703 (1 peptide, 0.4%) and NCI-H1755 (1 peptide, 0.4%); (5) mitochondrial transit peptide depletion, NCI-H1703 (17 peptides, 6.3%) and NCI-H1755 (16 peptides, 6.9%); and (6) novel N-terminal neo peptide, NCI-H1703 (33 peptides, 12.1%) and NCI-H1755 (30 peptides, 12.9%) (Supplementary Table S4).
We performed a pathway analysis of differentially expressed proteins and identified N-terminal peptides in the 2 cell lines. To define the related pathways, all proteins in the lists were subjected to KEGG pathway analysis (Supplementary Fig. S4). Fourteen proteins were involved in the focal adhesion pathway in relation to cell invasion, growth, proliferation, and migration (Supplementary Table S5), 5 of which (FLNA, FLNB, CAV1, MYL12B, and CAPN2) were common to the two parallel experiments. Three proteins (CRKL, PPP1CB, and MAPK3) were identified only in the N-terminal peptide analysis, and 6 proteins (VASP, VCL, RHOA, ACTN4, MAPK1, and ITGA2) appeared only in the label-free quantitative analysis. Thirteen of the 14 focal adhesion proteins showed differential expression in both cell lines in at least 1 experiment; the exception was FLNA, which contained a novel N-terminal neo peptide (PATEKDLAEDAPWKKIQQNTFTR) in the NCI-H1703 and NCI-H1755 lines (Supplementary Table S5 and Fig. 5).
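The overlap between the two experiments can be tallied with set operations. The gene symbols are those listed in the paragraph above; the variable names are illustrative only.

```python
# Focal adhesion proteins grouped by the experiment(s) that detected them.
common = {"FLNA", "FLNB", "CAV1", "MYL12B", "CAPN2"}
n_terminal_only = {"CRKL", "PPP1CB", "MAPK3"}
label_free_only = {"VASP", "VCL", "RHOA", "ACTN4", "MAPK1", "ITGA2"}

# Full list seen by each experiment, and the union over both.
label_free = common | label_free_only
n_terminal = common | n_terminal_only
focal_adhesion = label_free | n_terminal

summary = {
    "label_free": len(label_free),
    "n_terminal": len(n_terminal),
    "total": len(focal_adhesion),
}
```

The tallies reproduce the 11 proteins from the label-free quantification, the 8 from the N-terminal peptide analysis, and the 14 proteins overall.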
Six proteins (ITGA2, FLNA, FLNB, CAPN2, ACTN4, and MAPK1) were upregulated in metastatic lung cancer cells by label-free quantification analysis versus 3 downregulated proteins (RHOA, VASP, and VCL); 2 proteins (CAV1 and MYL12B) were not differentially expressed. Three proteins (CRKL, PPP1CB, and MAPK3) were identified only in the N-terminal peptide analysis, in which we identified a fragment (novel N-terminal neo peptide) from CRKL in NCI-H1703 cells and methionine-depleted N-terminal peptides from PPP1CB and MAPK3 at the initial N-terminus. Protein phosphatase 1 (PPP1CB) is overexpressed in lung cancer (Liu et al., 2007) and is activated by phosphorylation. Although PPP1CB was detected by N-terminal peptide analysis only in NCI-H1755 cells, we excluded it from subsequent analyses, due to the lack of phosphorylation data in this analysis.
Most NSCLC patients develop metastases, resulting in incurable disease at the time of diagnosis. Despite the advances in cancer research, there are few biomarkers for early-stage cancer, and our understanding of metastasis is poor (Tan et al., 2012). Also, metastasis has become the chief obstacle to the treatment of lung cancer. Thus, it will be helpful to determine the mechanisms of metastasis. To this end, our study has generated phenotypic data from primary and metastatic NSCLC using NCI-H1703 and NCI-H1755 cells, respectively.
Label-free quantitative analysis, based on MS1 peak intensities (Domon and Aebersold, 2006) and MS/MS spectral counts (Liu et al., 2004), is valuable in the large-scale analysis of proteins and peptides. General analysis of spectral counts has a limit of quantitation for low-abundance proteins (≤ 4 spectra detected) and post-translationally modified proteins (Freund and Prenni, 2013). However, the analysis is suitable for the detection of subtle abundance changes in most proteins with high sensitivity and reproducibility (Old et al., 2005).
In this study, we identified 2130 nonredundant proteins with 218,323 spectra by cell lysate profiling at a minimum of 2 distinct peptides per protein, based on an FDR of 0.3%. We also required 5 or more spectral counts for the identifications, for which spectral counts were normalized by NSAF. Lastly, 671 proteins were used for the label-free quantification, which allowed us to identify differentially expressed proteins (
Of the 242 differentially expressed proteins, transaldolase (TALDO1) is a novel serum biomarker for a model of hepatocellular carcinoma (HCC) metastasis and for HCC patients (Wang et al., 2011). TALDO1 was overexpressed in NCI-H1755 versus NCI-H1703 cells. Ghosh et al. reported global proteomic alterations in colorectal cancer cell metastasis, 8 proteins of which were consistent with our dataset: 3 upregulated proteins (ALDH2, HSP90B1, and PDIA4) and 5 downregulated proteins (EIF2S2, MCM6, MCM7, PSMC1, and PSMC2) (Ghosh et al., 2011).
Many proteins, such as isoform 2 of filamin-A (FLNA), isoform 1 of filamin-B (FLNB), isoform A of prelamin-A/C (LMNA), and vimentin (VIM), which were classified under the GO term “cell structure and motility,” were upregulated in the metastatic NCI-H1755 line (Supplementary Table S1). In particular, LMNA is a metastatic biomarker of colorectal cancer cells (Willis et al., 2008) and a marker of embryonic stem cell differentiation (Constantinescu et al., 2006), although this status has not been reported in NSCLC metastasis.
Cell proliferation molecules, such as isoform 1 of protein CDV3 homolog (CDV3), isoform 1 of epidermal growth factor receptor (EGFR), and histone-binding protein RBBP7 (RBBP7), were downregulated in the NCI-H1755 cells. Conversely, isoform 1 of annexin A7 (ANXA7), 60-kDa heat shock protein mitochondrial (HSPD1), proliferating cell nuclear antigen (PCNA), and isoform 3 of thioredoxin reductase 1 cytoplasmic (TXNRD1) were upregulated in this line. ANXA7 is a biomarker of progression in prostate and breast cancer (Srivastava et al., 2001); we also noted a 1.7-fold increase in NCI-H1755 cells.
Protein fragmentation reactions are linked to cancer metastasis. Several studies have demonstrated that potential cancer biomarkers, such as HER2 rb2 and CYFRA 21-1, are generated by protein fragmentation (Pujol et al., 1993; Streckfus et al., 2000). For example, CYFRA 21-1, a protein fragment, is known to be related to lung cancer metastasis, although it is not a specific marker for lung cancer diagnosis. In searching for markers that are elicited by protein fragmentation, we identified newly generated N-terminal peptides using positional proteomics methods. In brief, natural N-termini are blocked by certain labeling methods, such as acetylation (McDonald and Beynon, 2006), dimethylation (Hsu et al., 2003), iTRAQ (Prudova et al., 2010), and PITC (Edman reagent) (Dugaiczyk et al., 1982). In our study, N-termini were labeled by acetylation, based on its simplicity and high labeling efficiency. Ultimately, we identified 27 novel N-terminal neo peptides that were differentially generated between metastatic cells and primary cancer cells. Notably, natural cleavages of N-terminal peptides, such as initial methionine depletion, signal peptide depletion, propeptide depletion, and transit peptide depletion, were also detected and annotated using the UniProt database (Apweiler et al., 2004). Specifically, of the initial methionine-depleted proteins, we identified 44 proteins that do not exist in the UniProtKB database.
In the N-terminal peptide analysis, 92 peptides from 87 proteins were detected in NCI-H1703 cells, whereas 53 peptides from 46 proteins were identified in NCI-H1755 cells (Supplementary Fig. S3); 27 peptides were categorized as novel N-terminal neo peptides (like the fragment peptides), and 15 novel N-terminal neo peptides appeared only in NCI-H1703 cells. Notably, EPH receptor A2 (EPHA2) is a marker of NSCLC progression (Brannan et al., 2009), and a novel N-terminal neo peptide of EPHA2 was detected in primary cancer cells. However, EPHA2 was observed in both cell lines by label-free quantitative analysis (not used for quantification due to a spectral count below 5).
Five proteins were identified with fragment N-terminal peptides, whereas their expression did not differ by label-free quantification analysis (Table 2). Four of them (DDX3X, RPL4, RPL30, and XRCC6) were observed only in NCI-H1703 cells by N-terminal peptide analysis, whereas SHMT2 was detected only in NCI-H1755 cells. Further, these four proteins (DDX3X, RPL4, RPL30, and XRCC6) are associated with cell proliferation and differentiation in metastasis (Bauer et al., 2012; Li et al., 2011; Yoon et al., 2006). In this study, the four proteins that were identified with novel N-terminal neo peptides were expressed in equal amounts in the cell lines, but they could not affect the metastasis of primary cancer cells (NCI-H1703).
We found 138 proteins that were common to both experiments (Supplementary Table S6). Most proteins, including natural N-terminal peptides that were differentially identified by N-terminal analysis, except for histone-binding protein RBBP7 (RBBP7), were consistent with their expression levels in the label-free quantification analysis. For example, creatine kinase B-type (CKB) was identified with initial methionine-depleted N-termini only in NCI-H1703 cells by N-terminal analysis, whereas CKB was significantly upregulated in NCI-H1703 cells by label-free quantitative analysis.
In the classification of the 138 commonly identified proteins by KEGG pathway, the proteins were primarily involved in aminoacyl-tRNA biosynthesis, the pentose phosphate pathway, the proteasome, arginine and proline metabolism, DNA replication, and focal adhesion (Supplementary Fig. S4). Focal adhesion is a major pathway of cancer metastasis, and we identified 15 proteins that were related to focal adhesion in the 2 profiling experiments (Fig. 5 and Supplementary Table S5). Of the 138 proteins, 11 proteins, identified by label-free quantification analysis, participated in focal adhesion: 6 proteins were upregulated, 3 proteins were downregulated, and 2 proteins were not differentially expressed. Conversely, of the proteins that were identified by N-terminal peptide analysis, 8 were involved in focal adhesion.
Integrin alpha-2 (ITGA2) was upregulated by 2.4-fold in NCI-H1755 cells. Apparently, ITGA2 mediates metastasis to the liver by regulating the focal adhesion pathway (Yoshimura et al., 2009). Overexpression of integrin proteins (ITGA and ITGB) initiates a signaling cascade to alpha-actinin-4 (ACTN4), FLNA, FLNB, and FAK (not identified in our data) to effect cell proliferation and growth (Shibue and Weinberg, 2009) (Fig. 5). Notably, ACTN4, FLNA, and FLNB were overexpressed in NCI-H1755 cells in this study. In addition, MAPK1 (also known as ERK2), upregulated in metastatic cells, is a point at which multiple biochemical signals integrate (Wu et al., 2008) (Fig. 5).
MAP kinases mediate many processes in cancer cells, such as proliferation, migration, invasion, and metastasis (Obchoei et al., 2011). Increased expression of MAPK1 promotes the expression of CAPN2, which functions in cell movement, migration, and invasion during metastasis (Storr et al., 2011). In the N-terminal peptide analysis, v-crk sarcoma virus CT10 oncogene homolog (avian)-like (CRKL) was identified with a novel N-terminal neo peptide only in NCI-H1703 cells. Because CRKL activates ERK signaling to promote cell proliferation, survival, and invasion in lung cancer (Kim et al., 2010), we hypothesize that CRKL function is regulated by fragmentation events during metastasis.
In summary, we have identified differentially expressed proteins that distinguish primary and metastatic lung cancer. Many of these quantitative proteins and N-terminal peptides are involved in pathways in cell migration, proliferation, and metastasis. Thus, our datasets of proteins and fragment peptides in lung cells might be valuable in discovering and validating lung cancer biomarkers and metastasis markers.
Top 15
IPI (a) | MW (kDa) | Ratio (b) | p-value (c) | Gene symbol | Protein name |
---|---|---|---|---|---|
IPI00013744 | 129.3 | 6.6 | 0.0001 | ITGA2 | Integrin alpha-2 |
IPI00006663 | 56.4 | 6.57 | 0.0004 | ALDH2 | Aldehyde dehydrogenase, mitochondrial |
IPI00553131 | 38.3 | 5.9 | 0.0003 | GALE | UDP-glucose 4-epimerase |
IPI00413641 | 35.9 | 3.81 | 0.0022 | AKR1B1 | Aldose reductase |
IPI00216008 | 62.5 | 3.35 | 0.0023 | G6PD | Isoform Long of Glucose-6-phosphate 1-dehydrogenase |
IPI00017376 | 86.5 | 2.86 | 0.0015 | SEC23B | Protein transport protein Sec23B |
IPI00215743 | 152.5 | 2.6 | 0.0001 | RRBP1 | Isoform 3 of Ribosome-binding protein 1 |
IPI00001539 | 41.9 | 2.38 | 0.0009 | ACAA2 | 3-ketoacyl-CoA thiolase, mitochondrial |
IPI00292771 | 238.3 | 2.23 | 0.0018 | NUMA1 | Isoform 1 of Nuclear mitotic apparatus protein 1 |
IPI00744692 | 37.5 | 2.22 | 0.0000 | TALDO1 | Transaldolase |
IPI00643920 | 68.8 | 2.04 | 0.0016 | TKT | cDNA FLJ54957, highly similar to Transketolase |
IPI00414717 | 134.6 | 2.03 | 0.0223 | GLG1 | Isoform 2 of Golgi apparatus protein 1 |
IPI00219525 | 51.9 | 2.01 | 0.0001 | PGD | 6-phosphogluconate dehydrogenase, decarboxylating |
IPI00027223 | 46.6605 | 2.01 | 0.0010 | IDH1 | Isocitrate dehydrogenase [NADP] cytoplasmic |
IPI00003479 | 41.3919 | 2 | 0.0082 | MAPK1 | Mitogen-activated protein kinase 1 |
IPI00001453 | 55.4 | -7.47 | 0.0055 | INA | Alpha-internexin |
IPI00397526 | 230.8 | -6.87 | 0.0039 | MYH10 | Isoform 1 of Myosin-10 |
IPI00607787 | 58.7 | -6.64 | 0.0041 | UAP1 | Isoform 3 of UDP-N-acetylhexosamine pyrophosphorylase |
IPI00856045 | 616.6 | -6.59 | 0.0028 | AHNAK2 | Isoform 1 of Protein AHNAK2 |
IPI00333619 | 54.8 | -6.52 | 0.0027 | ALDH3A2 | Isoform 1 of Fatty aldehyde dehydrogenase |
IPI00178150 | 139.9 | -6.36 | 0.0083 | KIF4A | Isoform 1 of Chromosome-associated kinesin KIF4A |
IPI00237884 | 181.0 | -6.26 | 0.0416 | AKAP12 | Isoform 1 of A-kinase anchor protein 12 |
IPI00218775 | 51.2 | -6.12 | 0.0051 | FKBP5 | Peptidyl-prolyl cis-trans isomerase FKBP5 |
IPI00023972 | 50.6 | -6.11 | 0.0041 | DDX47 | Probable ATP-dependent RNA helicase DDX47 |
IPI00003505 | 48.6 | -5.96 | 0.0039 | TRIP13 | Isoform 1 of Pachytene checkpoint protein 2 homolog |
IPI00396627 | 92.1 | -5.95 | 0.0118 | ELAC2 | Isoform 1 of Zinc phosphodiesterase ELAC protein 2 |
IPI00022977 | 42.6 | -5.89 | 0.0085 | CKB | Creatine kinase B-type |
IPI00294187 | 75.6 | -5.89 | 0.0008 | PADI2 | Protein-arginine deiminase type-2 |
IPI00017303 | 104.7 | -5.89 | 0.0201 | MSH2 | DNA mismatch repair protein Msh2 |
IPI00218922 | 88.0 | -5.77 | 0.0110 | SEC63 | Translocation protein SEC63 homolog |
(a) IPI accession number of each protein
(b) Log2 expression ratio (NCI-H1755/NCI-H1703) of significantly different proteins, based on NSAF values
(c) Significant difference in t-test (
Proteolytic events identified with less than 1.5 fold change
IPI | Peptide sequence (a) | Ratio (b) | N-terminal analysis (c) | Gene symbol | Protein name |
---|---|---|---|---|---|
IPI00215637 | N. | -0.48 | NCI-H1703 | DDX3X | ATP-dependent RNA helicase DDX3X |
IPI00003918 | R. | -0.37 | NCI-H1703 | RPL4 | 60S ribosomal protein L4 |
IPI00219156 | V. | -0.15 | NCI-H1703 | RPL30 | 60S ribosomal protein L30 |
IPI00644712 | R. | 0.14 | NCI-H1703 | XRCC6 | X-ray repair cross-complementing protein 6 |
IPI00002520 | Q. | 0.30 | NCI-H1755 | SHMT2 | Serine hydroxymethyltransferase, mitochondrial |
(a) The peptide sequence observed in the N-terminal peptide analysis is written in italics.
(b) Log2 expression ratio (NCI-H1755/NCI-H1703), based on NSAF values from the label-free analysis
(c) Cell line in which the peptide sequence was detected by N-terminal analysis
Published online June 30, 2014 https://doi.org/10.14348/molcells.2014.0035
Hophil Min1, Dohyun Han1,2, Yikwon Kim1, Jee Yeon Cho3, Jonghwa Jin1, and Youngsoo Kim*,1,2
1Department of Biomedical Sciences, Medical Research Center, Seoul National University College of Medicine, Seoul 110-799, Korea; 2Institute of Medical and Biological Engineering, Medical Research Center, Seoul National University College of Medicine, Seoul 110-799, Korea; 3Division of Life Sciences and Biotechnology, Korea University, Seoul 136-701, Korea
Metastasis is a major cause of death from lung cancer that accompanies several processes, including the detachment of cancer cells, invasion of cancer cells into the surrounding tissue, and colonization of and proliferation in distant organs (Hwang et al., 2012; Tian et al., 2007). During metastasis, irreversible protein fragmentation occurs (Lopez-Otin and Bond, 2008). Dysregulation of protein fragment reactions in organs can cause pathological developmental disorders, such as cancer, inflammation, infection, and Alzheimer disease (Dawson and Dawson, 2003; Opferman and Korsmeyer, 2003; Rao, 2003).
In lung cancer, serum cytokeratin 19 fragments (CYFRA 21-1) are generated by a protein fragmentation reaction and have recently been implicated as a biomarker for the diagnosis and prognosis of NSCLC (Nisman et al., 2008). Pro1708/Pro2044 (the C-terminal fragment of albumin) (Kawakami et al., 2005) and HER2 rb2 (the ectodomain of human epidermal growth factor receptor-2) (Streckfus et al., 1999) are also cancer biomarkers that are generated by protein fragmentation. Identifying natural protease substrates and their cleavage sites provides essential information for understanding the regulation of metastatic pathways. Thus, the pathways that culminate in protein fragmentation events must be examined to develop novel and more effective molecular markers and therapeutic targets.
Proteomic analysis for global protein identification is a powerful tool that can be used to identify novel biomarkers in various diseases. Of such methods, label-free quantification determines the expression levels of nontarget proteins (Fanayan et al., 2013). Many global quantitative proteomics studies have examined metastasis in various cancers, such as colorectal cancer (Xue et al., 2010), breast cancer (Xie et al., 2010), and hepatocellular carcinoma (Wang et al., 2011). However, there are few reports on the proteomic profile in metastatic lung cancer. For instance, Tian et al. identified metastasis-related proteins in NSCLC cell lines (nonmetastatic CL1-0 and the highly metastatic CL1-5) by 2-DE analysis (Tian et al., 2007).
The recent development of N-terminal peptide analysis, based on mass spectrometry, has enabled us to generate data on the protein targets and fragment sites (Brown and Hartley, 1966). To this end, several groups have established a method of identifying protease-generated (neo) peptides in cellular pathways, known as N-terminomics (Enoksson et al., 2007). Combined fractional diagonal chromatography (COFRADIC) is a pioneering technique in N-terminomics. Free amines of proteins are first acetylated prior to trypsin digestion and RP-HPLC fractionation. The N-termini of neo peptides are then derivatized with a hydrophobic reagent to allow the original N-terminal peptides to be purified on rechromatography (Gevaert et al., 2003). However, the COFRADIC method requires many HPLC and LC-MS/MS runs and large amounts of starting material to select N-terminal neo peptides. McDonald and Beynon (2006) developed a more rapid and simpler N-terminal peptide analysis method (positional proteomics) that is based on negative selection by chemical labeling of the α-amine in proteins.
In this study, to differentiate primary cancer cells from metastatic cells, we performed 2 parallel experiments: label-free quantification and N-terminal peptide analysis (positional proteomics methods) by LC-MS/MS. Two human non-small-cell lung cancer cell lines were used: NCI-H1703, a stage I primary cancer line, and NCI-H1755, a stage IV metastatic cancer line (Anisowicz et al., 2008). Our label-free quantification identified 2130 proteins from the LC-MS/MS analysis, 242 of which were differentially expressed between NCI-H1703 and NCI-H1755 cells. Analysis of N-terminal neo peptides identified 325 N-terminal peptides, 45 of which were observed in both cell lines. This differential expression of the proteome and N-terminal neo peptides can increase our understanding of differentially regulated pathways between primary and metastatic cancer cells in human non-small-cell lung cancer.
HPLC-grade water, HPLC-grade acetonitrile (ACN), and HPLC-grade methanol (MeOH) were obtained from FISHER (USA). Hydrochloric acid (HCl) and sodium chloride (NaCl) were purchased from DUKSAN (Korea). Urea and dithiothreitol (DTT) were purchased from AMRESCO (USA). Phenylmethanesulfonyl fluoride (PMSF), sodium dodecyl sulfate (SDS), and Tris were obtained from USB (USA). Complete protease inhibitor cocktail tablets were acquired from ROCHE (USA), and sequencing-grade modified trypsin was purchased from PROMEGA (USA). Sulfo-NHS acetate and NHS-Activated agarose slurry were obtained from Pierce (USA). All other reagents (iodoacetamide, α-cyano-4-hydroxycinnamic acid (CHCA), and trifluoroacetic acid (TFA)) were purchased from Sigma-Aldrich (USA).
Stage 1 (NCI-H1703) and stage 4 non-small-cell lung cancer cells (NCI-H1755) were obtained from the Korean Cell Line Bank. Both lines were cultured in RPMI1640 (WelGENE, Korea) with 10% fetal bovine serum (Gibco, USA), 100 U/ml penicillin and 100 μg/ml streptomycin (Gibco, USA) and 25 mM HEPES (Gibco, USA). The cultures were maintained in 95% humidified air and 5% CO2 at 37°C.
To prepare the cell lysates, cells were grown to 80% confluence and lysed in a strong SDS-based buffer that contained 4% SDS, 0.1 mM PMSF, 1× protease inhibitor cocktail, 0.1 M DTT, and 0.1 M HEPES. Lysates were incubated at 95°C for 5 min and sonicated for 1 min. Supernatants were collected from the lysates by centrifugation at 15,000 × g for 20 min at 4°C. Protein concentrations were measured using the BCA Protein Assay Kit (reducing reagent-compatible) (Pierce, USA). Finally, each cell lysate was stored in 0.2-mg aliquots at -80°C until use.
Cell lysates were processed by filter-aided sample preparation (FASP) (Wisniewski et al., 2009) using a 10 K molecular weight cutoff (MWCO) filter (Millipore, USA). Briefly, 200 μg of cell lysate in lysis buffer (4% SDS, 0.1 mM PMSF, 1× protease inhibitor cocktail, 0.1 M DTT, and 0.1 M HEPES) was transferred to the filter and mixed with 0.2 ml 8 M urea in 0.1 M HEPES, pH 7.5 (FASP solution). Samples were centrifuged at 14,000 × g at 20°C for 20 min. The samples in the filter were diluted with 0.2 ml FASP solution and centrifuged again. To alkylate the reduced cysteines, 0.1 ml 50 mM iodoacetamide in FASP solution was added, and the samples were incubated at room temperature (RT) in the dark for 30 min and centrifuged for 20 min.
For the label-free quantification, alkylated samples were mixed with 0.2 ml 50 mM Tris solution and centrifuged at 14,000 × g at 20°C for 20 min; this step was repeated 3 times. One hundred microliters 50 mM Tris solution with trypsin (enzyme: protein ratio 1:80) was added to the resulting concentrate and incubated for 16 h at 37°C. Peptides were collected from the filter by centrifugation for 20 min to new collection tubes and acidified with 2% TFA.
Alkylated samples were mixed with 0.1 ml 50 mM HEPES with Sulfo-NHS acetate (Sulfo-NHS acetate:protein ratio at 25:1) and incubated for 2 h at RT. The samples were centrifuged at 14,000 × g at 20°C for 20 min, mixed with 0.2 ml 1 M Tris solution, and incubated on the filter for 4 h at RT. The samples were then centrifuged at 14,000 × g at 20°C for 20 min 4 times. One hundred microliters 50 mM Tris solution with trypsin (enzyme: protein ratio of 1:80) was added to the filter and incubated for 16 h at 37°C. Digested peptides were collected by centrifugation and acidified with 2% TFA.
Digested samples were desalted using in-house C18 StageTip desalting (STD) columns, as described (Han et al., 2012). Briefly, in-house C18 STD columns were prepared by reversed-phase packing of POROS 20 R2 material into 0.2-ml yellow pipet tips that sat atop C8 empore disk membranes. The STD columns were washed with 0.1 ml 100% methanol and with 0.1 ml 100% ACN 3 times and equilibrated 3 times with 0.1 ml 0.1% TFA. After the peptides were loaded, the STD columns were washed 3 times with 0.1 ml 0.1% TFA, and the peptides were eluted with 0.1 ml of a series of elution buffers, containing 0.1% TFA and 40, 60, and 80% ACN. All eluates were combined and dried in a vacuum centrifuge.
Dried samples were dissolved in BupH PBS (Pierce, USA). One milliliter of an NHS-agarose bead slurry (50% slurry in acetone) was prepared per the manufacturer's protocol (Pierce, USA). Briefly, acetone was removed from the slurry by centrifugation, and the slurry was washed 2 times with water and equilibrated 3 times with BupH PBS. After mixing with the equilibrated beads, the labeled samples were incubated for 4 h at RT. Finally, the beads were centrifuged at 1,000 × g for 30 s, and the supernatant was transferred to new tubes, acidified with 2% TFA, and desalted again.
Bovine serum albumin (BSA) peptides (Amresco, USA) were N-terminally labeled as described above as a control. The peptides were dissolved in 10 μl 0.1% TFA, and 0.5 μl of each sample was mixed with 0.5 μl of a matrix solution that contained 5 mg/ml CHCA (Sigma, USA), 70% ACN, and 0.1% TFA. The peptides were spotted directly onto a MALDI plate (Opti-TOF 384-well Insert, Applied Biosystems, USA) and crystallized with the matrix. Dried peptides were analyzed on a 4800 MALDI-TOF/TOF Analyzer (Applied Biosystems) that was equipped with a 355-nm Nd:YAG laser. The pressure in the TOF analyzer was approximately 7.6 × 10^-7 Torr.
The mass spectra were obtained in the reflectron mode over an m/z range of 800-3500 Da with an accelerating voltage of 20. External calibration was performed using des-Arg-Bradykinin (904.468 Da), angiotensin 1 (1,296.685 Da), Glu-Fibrinopeptide B (1,570.677 Da), adrenocorticotropic hormone (ACTH) (1-17) (2,093.087 Da), and ACTH (18-39) (2,465.199 Da) (4700 calibration mixture, Applied Biosystems). Raw data were reported by 4000 SERIES EXPLORER, v4.4 (Applied Biosystems).
All peptide samples were analyzed on an LTQ-Orbitrap Velos mass spectrometer (Thermo Scientific, USA) that was coupled to an EasyLC II (Proxeon Biosystems, Denmark), equipped with a nanoelectrospray device and fitted with a 10-μm fused silica emitter tip (New Objective, USA). Ten microliters of each sample was loaded onto a nano-LC trap column (ZORBAX 300SB-C18, 5 μm, 0.3 × 5 mm, Agilent, USA), and peptides were separated on a C18 analytical column (75 μm × 15 cm) that was packed in-house with C18 resin (Magic C18-AQ 200 Å, 5-μm particles). Solvent A was 98% water with 0.1% formic acid and 2% ACN, and Solvent B was 98% ACN with 0.1% formic acid and 2% water.
Peptides were separated using a 180-min gradient at 300 nl/min, comprising 0% to 40% B for 120 min, 40% to 60% B for 20 min, 60% to 90% B for 10 min, 90% B for 10 min, 90% to 5% B for 10 min, and 0% B for 10 min. The spray voltage was set to 1.8 kV, and the temperature of the heated capillary was 200°C. The mass spectrometer scanned a mass range of 300 to 2000. The data on the top 10 most abundant ions were analyzed in data-dependent scan mode over a minimum threshold of 1000. The normalized collision energy was adjusted to 35%, and the dynamic exclusion was set to a repeat count of 1, repeat duration of 30 s, exclusion duration of 60 s, and ± 1.5 m/z exclusion mass width. Each biological replicate was analyzed in triplicate.
After the data acquisition, data searches were performed using SEQUEST Sorcerer (Sage-N Research, USA). Raw files from the LTQ-Orbitrap Velos were converted into mzXML files using Trans-Proteomics Pipeline (TPP, ISB, USA). MS/MS data were searched using a target-decoy database strategy against a composite database that contained the International Protein Index (IPI) human database (v3.87, 91,464 entries) and its reverse sequences, which were generated using Scaffold 3 (Proteome Software Inc., USA).
Two independent sets of search parameters were used for the label-free quantification dataset and the N-terminal peptide dataset. Parameters for the label-free quantification dataset were as follows: enzyme, full-trypsin; peptide tolerance, 10 ppm; MS/MS tolerance, 1.0 Da; variable modifications, oxidation (Met); and static modifications, carbamidomethylation (Cys). Identified proteins were filtered using Scaffold 3, based on a minimum of 2 unique peptides and false discovery rate (FDR) < 1%. The parameters for the N-terminal peptide dataset were as follows: enzyme, semi-arginine; peptide tolerance, 10 ppm; MS/MS tolerance, 1.0 Da; variable modifications, oxidation (Met); and static modifications, carbamidomethylation (Cys) and acetylation (N-term and Lys). Peptide-spectrum matches were filtered to less than a 1% FDR using the statistics tools in TPP.
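The target-decoy filtering described above can be illustrated with a minimal sketch. It assumes each peptide-spectrum match (PSM) carries a score (higher is better) and a flag marking hits to reversed sequences, and it estimates the FDR at a cutoff as decoys divided by targets. The function name and the toy scores are hypothetical; this is not the procedure implemented in Scaffold 3 or TPP, which rely on their own statistical models.

```python
from typing import List, Tuple


def score_threshold_for_fdr(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> float:
    """Return the lowest score cutoff whose decoy-estimated FDR is <= max_fdr.

    psms: (score, is_decoy) pairs; a higher score means a better match.
    The FDR at a cutoff is estimated as (#decoys / #targets) among all PSMs
    scoring at or above that cutoff. Returns float('inf') if no cutoff qualifies.
    """
    ordered = sorted(psms, key=lambda p: p[0], reverse=True)
    best = float("inf")
    targets = decoys = 0
    for i, (score, is_decoy) in enumerate(ordered):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Evaluate the FDR only once per distinct score (after the last tie).
        if i + 1 < len(ordered) and ordered[i + 1][0] == score:
            continue
        if targets > 0 and decoys / targets <= max_fdr:
            best = score
    return best


# Toy PSM list (hypothetical scores): one decoy hit among five matches.
toy_psms = [(5.0, False), (4.5, False), (4.0, False), (3.5, True), (3.0, False)]
strict_cutoff = score_threshold_for_fdr(toy_psms, max_fdr=0.0)
loose_cutoff = score_threshold_for_fdr(toy_psms, max_fdr=0.25)
```

With real data, PSMs scoring at or above the returned cutoff would be kept; the 1% level quoted in the text corresponds to `max_fdr=0.01`.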
The label-free quantitative analysis of peptides was performed by spectral counting analysis. To calculate a protein spectrum count, we exported the numbers of peptides that were assigned to each protein from Scaffold 3. Exported data were analyzed by normalized spectral abundance factor (NSAF) method to normalize run-to-run variations (Zybailov et al., 2006). NSAF values were calculated as:
$$\mathrm{NSAF}_i = \frac{SpC_i / Mw_i}{\sum_{j=1}^{n} \left( SpC_j / Mw_j \right)}$$
where SpC is the spectral count of a protein, Mw is its molecular weight in kDa, and n is the total number of proteins over which the sum is taken. Because some expression ratios are calculated from spectral counts of 0, which causes certain data to be represented as '#DIV/0!' in Microsoft Office Excel 2010, we shifted all spectral counts equally by adding 0.1 to the original values. By NSAF method, we could compare expression levels and apply independent 2-sample
Data were analyzed using various bioinformatics tools. To determine N-terminal peptide sites, we performed manual annotations using UniProtKB (Universal Protein Resource Knowledgebase) (
The biological process and molecular function classifications of identified proteins were analyzed using PANTHER ID numbers (
To differentiate the proteomic changes between primary and metastatic cells, whole-cell lysates of cultured human non-small-cell lung cancer cell lines (NCI-H1703 and NCI-H1755) were analyzed in parallel experiments, as depicted in Fig. 1. Each cell line was cultured as 3 independent biological replicates and prepared by FASP.
For the label-free quantitative proteomic analysis, cell lysates were digested with trypsin and desalted with a C18 in-house stage tip prior to LTQ-Orbitrap Velos analysis. To ensure the reliability of the quantitative profiling, each sample was injected in triplicate (3 technical replicates) for each biological replicate. A total of 18 raw files from the LTQ-Orbitrap Velos were processed in Scaffold 3 with the SEQUEST algorithm.
To analyze the N-terminal peptide data, free amines in the cell lysates were labeled by NHS-acetate. The remaining NHS-acetate was quenched by the amine group of Tris. N-terminally labeled proteins were digested with trypsin, desalted using C18 in-house stage tips, and filtered with NHS-activated beads, which depleted the N-termini newly generated by trypsin. The supernatant was then desalted with C18 in-house stage tips again. To profile the N-terminal peptides, the samples were analyzed in triplicate (3 technical replicates) for each biological replicate. A total of 18 raw data files were then processed in SEQUEST and TPP. All data from the whole-cell lysates and N-terminal peptides were classified using informatics tools.
Samples were prepared by FASP, and LC-MS/MS analysis was performed using the LTQ-Orbitrap Velos. MS/MS data were acquired for the biological and technical triplicates for each cell line and processed to identify peptides that generated the observed spectra, and proteins were inferred, based on the identified peptides. Because the MS/MS spectral counts for peptides from shotgun proteomic approaches have recently been shown to estimate protein abundance well, we performed a label-free quantitative analysis of NSCLC cell lines, based on a shotgun proteomics strategy and spectral counting techniques.
A total of 18 raw files from the 2 cell lines were combined into a single merged output file in Scaffold 3, in which the analysis was restricted to proteins with at least 2 unique peptides and an FDR < 0.5%. Per these criteria, we reproducibly identified 2130 nonredundant proteins (Fig. 2A and Supplementary Table S1), 28% of which were identified by 2 unique peptides, whereas 17% were identified by 3 unique peptides, 11% were identified by 4 unique peptides, and 44% were identified by more than 5 unique peptides (Fig. 2B).
We classified all identified proteins by gene ontology (GO) analysis with regard to biological process and molecular function. Many proteins mapped to the GO terms “protein metabolism and modification” (309 proteins), “intracellular protein traffic” (213 proteins), “protein biosynthesis” (147 proteins), “cell structure and motility” (147 proteins), and “cell cycle in biological process” (95 proteins) (Fig. 2C). Notably, molecular functions were assigned to many proteins: 493 proteins were annotated with the GO term “nucleic acid binding,” 157 proteins were related to “cytoskeletal protein,” 123 proteins fell under “dehydrogenase,” and 85 proteins were “membrane traffic proteins” (Fig. 2D) (Supplementary Table S1).
To quantify the identified proteins by spectral count, we used normalized spectral abundance factors (NSAF), with which the total number of spectra of an identified protein in each LC-MS/MS run correlates well with the abundance of the corresponding protein over a wide linear dynamic range (Zybailov et al., 2006). High-confidence proteins for label-free quantitation were selected with an average spectral count ≥ 5 in 9 datasets (3 technical and 3 biological replicates) in either cell line. Also, missing values from each dataset were exchanged with a value of 0. Of the 2130 identified proteins, 671 satisfied our label-free quantitative protein criteria (Supplementary Table S2).
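The NSAF normalization and the spectral-count selection criterion can be sketched as follows. The function names and the toy input values are hypothetical; the 0.1 shift applied to every spectral count and the average-count threshold of 5 are taken from the text. This is a minimal illustration under those assumptions, not the spreadsheet workflow that was actually used.

```python
from typing import Dict, Sequence

PSEUDOCOUNT = 0.1  # shift added to every spectral count, as described in the text


def nsaf(spectral_counts: Dict[str, float], mw_kda: Dict[str, float]) -> Dict[str, float]:
    """Compute NSAF values for one LC-MS/MS run.

    spectral_counts: protein ID -> spectral count (missing values already set to 0)
    mw_kda:          protein ID -> molecular weight in kDa
    """
    # Spectral abundance factor: shifted spectral count divided by molecular weight.
    saf = {p: (spc + PSEUDOCOUNT) / mw_kda[p] for p, spc in spectral_counts.items()}
    total = sum(saf.values())
    # Normalizing by the run total makes the values comparable between runs.
    return {p: value / total for p, value in saf.items()}


def passes_count_filter(counts_a: Sequence[float], counts_b: Sequence[float],
                        min_mean: float = 5.0) -> bool:
    """True if the mean spectral count reaches min_mean in either cell line."""
    mean_a = sum(counts_a) / len(counts_a)
    mean_b = sum(counts_b) / len(counts_b)
    return mean_a >= min_mean or mean_b >= min_mean


# Toy run with two hypothetical proteins; P2 has a spectral count of 0.
toy_counts = {"P1": 10, "P2": 0}
toy_mw = {"P1": 50.0, "P2": 25.0}
toy_nsaf = nsaf(toy_counts, toy_mw)
```

Because of the shift, a protein with a spectral count of 0 still receives a small nonzero NSAF value, so a ratio between cell lines can always be formed.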
The distribution of the ratio correlation between NCI-H1703 and NCI-H1755 in the 3 biological replicates was selectively plotted, as shown in Supplementary Fig. S1A, in which the 3 distributions had high similarity. To determine the fold-change in expression for each protein between the 2 cell lines, the standard deviation of the 671 quantitative proteins was calculated for the 3 biological replicates, indicating that approximately 90% fell within 0.5 standard deviation (Supplementary Fig. S1B) (Kim et al., 2012). The differential expression ratios for the 671 protein groups are shown in Supplementary Fig. S1C, in which ratios ≥ 1.5-fold are shadowed. The expression of 242 proteins changed ≥ 1.5-fold between NCI-H1703 and NCI-H1755 cells; 92 proteins were upregulated, and 150 proteins were downregulated. For example, integrin alpha-2 (ITGA2), aldehyde dehydrogenase, mitochondrial (ALDH2), UDP-glucose 4-epimerase (GALE), and aldose reductase (AKR1B1) were preferentially expressed in NCI-H1755 cells. Conversely, alpha-internexin (INA), isoform 1 of myosin-10 (MYH10), isoform 3 of UDP-N-acetylhexosamine pyrophosphorylase (UAP1), and isoform 1 of protein AHNAK2 (AHNAK2) were significantly downregulated in NCI-H1755 cells (Table 1 and Supplementary Table S3).
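The 1.5-fold criterion can be expressed on the log2 scale that Tables 1 and 2 use for the NCI-H1755/NCI-H1703 ratio. The sketch below assumes mean NSAF values per cell line as input; the function name and example values are hypothetical, and the significance test that accompanies the fold-change cutoff is deliberately left out.

```python
import math


def classify_fold_change(nsaf_h1755: float, nsaf_h1703: float, fold: float = 1.5) -> str:
    """Label a protein by its log2(NCI-H1755 / NCI-H1703) NSAF ratio.

    Both inputs must be positive, which the 0.1 spectral-count shift guarantees.
    """
    log2_ratio = math.log2(nsaf_h1755 / nsaf_h1703)
    cutoff = math.log2(fold)  # about 0.585 for a 1.5-fold change
    if log2_ratio >= cutoff:
        return "upregulated in NCI-H1755"
    if log2_ratio <= -cutoff:
        return "downregulated in NCI-H1755"
    return "not differentially expressed"


# Hypothetical NSAF values for three proteins.
label_up = classify_fold_change(3.0e-4, 1.0e-4)
label_down = classify_fold_change(1.0e-4, 3.0e-4)
label_flat = classify_fold_change(1.2e-4, 1.0e-4)
```

On this scale, the log2 ratios listed in Table 2 (between -0.48 and 0.30) all fall inside the ±0.585 window, which is consistent with their description as changes of less than 1.5-fold.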
The scheme with which N-terminal peptides were identified is shown in Fig. 3. The N-termini of proteins are characterized by an α-amine, as opposed to the ε-amines that are on lysine side chains. Thus, ε-amines on lysine side chains had to be blocked. We blocked the α-amine and ε-amine groups by acetylation using NHS-acetate. In the subsequent quenching step, the unbound NHS-acetate was depleted by the amine in Tris. Next, proteins were digested with trypsin, generating new peptide N-termini with free amino groups. Then, we added NHS-activated beads, which bind the free amine groups of the N-terminal peptides newly generated by trypsin, whereas natural N-terminal peptides, which are blocked by acetylation, remain unbound (McDonald and Beynon, 2006).
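The logic of this negative selection can be illustrated with a small in silico digestion. The sketch assumes that all lysines are acetylated, so trypsin cleaves only C-terminal to arginine, and it ignores missed cleavages and the proline rule. The function names are hypothetical, and the toy sequence is illustrative only: its first 10 residues follow the BSA N-terminal peptide named in the text, and the rest is invented.

```python
from typing import List, Tuple


def digest_after_arg(sequence: str) -> List[str]:
    """Cleave C-terminal to every arginine (Arg-C-like specificity).

    This mimics trypsin acting on a protein whose lysines are all acetylated.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "R":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without a trailing R
    return peptides


def negative_selection(sequence: str) -> Tuple[str, List[str]]:
    """Split digestion products into (unbound N-terminal peptide, bead-bound peptides)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    peptides = digest_after_arg(sequence)
    # The first peptide keeps the acetyl-blocked original alpha-amine and stays
    # in the supernatant; all other peptides expose a free alpha-amine that
    # reacts with the NHS-activated beads.
    return peptides[0], peptides[1:]


# Toy sequence: residues after the first 10 are invented for illustration.
unbound, captured = negative_selection("DTHKSEIAHRAAKGGRLLK")
```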
In a control experiment, we examined whether this scheme could identify the natural N-termini of bovine serum albumin (BSA). Precursor BSA comprises 607 amino acids, whereas the mature form of BSA contains 583 amino acids, lacking residues 1-24 (Weijers, 1977). Thus, our BSA had an aspartic acid at residue 25 as its natural N-terminus.
Acetylated BSA was digested with trypsin and analyzed by MALDI-MS (Supplementary Fig. S2A). The observed peptide masses were consistent with the expected Arg-C-specific digestion of BSA (acetylated lysine is resistant to tryptic cleavage) and included the known N-terminal peptide (Ac-DTHK(ac)SEIAHR) at 1277.6 m/z. As expected, a range of lysine-containing peptides appeared, increasing by 42.03 Da per lysine. After the BSA peptides that were newly generated by tryptic digestion were removed with NHS-activated beads, we detected a single major peak at 1277.6 m/z by mass spectrometry. The N-terminal peptide of BSA had 1 peak that was mass-shifted by the acetylation of the α-amine and ε-amine and was confirmed with the peptide fingerprint by MS/MS analysis (Supplementary Fig. S2B).
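The observed m/z of the N-terminal BSA peptide can be checked with a short calculation. The sketch below uses standard monoisotopic residue masses and assumes a singly protonated ion with one acetyl group on the α-amine and one on each lysine; the function name is hypothetical.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565    # added once per peptide (free N- and C-terminus)
PROTON = 1.007276    # charge carrier for the singly protonated ion
ACETYL = 42.010565   # monoisotopic mass shift per acetylation (C2H2O)


def acetylated_mh(sequence: str) -> float:
    """[M+H]+ of a peptide acetylated at its N-terminus and at every lysine."""
    n_acetyl = 1 + sequence.count("K")
    backbone = sum(RESIDUE_MASS[residue] for residue in sequence)
    return backbone + WATER + n_acetyl * ACETYL + PROTON


bsa_nterm_mz = acetylated_mh("DTHKSEIAHR")
print(f"Ac-DTHK(ac)SEIAHR [M+H]+ = {bsa_nterm_mz:.2f}")
```

Under these assumptions, the calculated value agrees with the 1277.6 m/z peak reported for Ac-DTHK(ac)SEIAHR.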
N-terminal peptides were identified in the 2 cell lines by positional proteomics analysis, as described (McDonald and Beynon, 2006). All samples were analyzed with 3 biological and technical replicates, and 307 unique proteins (272 peptides from 261 proteins in NCI-H1703 and 233 peptides from 220 proteins in NCI-H1755) were identified with more than 2 hits in the biological replicate analysis, with > 95% peptide probability and FDR < 1%. Ultimately, 92 unique N-terminal peptides were identified in NCI-H1703 cells compared to 53 in the NCI-H1755 cells (Supplementary Figs. S3A and S3B; Supplementary Table S4).
We analyzed the biological process and molecular function of the identified proteins. With regard to biological process, many proteins were enriched for the GO terms “protein metabolism and modification,” “protein biosynthesis,” and “mRNA splicing.” Many proteins mapped to the molecular function GO terms “nucleic acid binding” (62 proteins), “ribosomal protein” (30 proteins), and “chaperone in molecular function” (18 proteins) (Supplementary Figs. S3C and S3D).
The identified N-terminal peptides were divided into natural N-termini and novel N-terminal neo peptides. Most proteins undergo systematic depletion of their natural N-termini to function. For example, certain proteins have their signal peptides excised from the N-terminus to be secreted. Thus, natural N-termini were grouped into 5 types, based on the molecule processing part of each protein sequence annotation in UniProtKB: initial methionine depletion, initial methionine nondepletion, signal peptide depletion, propeptide depletion, and mitochondrial transit peptide depletion. Apart from these natural N-termini, the newly identified peptides in the N-terminus analysis were annotated as novel N-terminal neo peptides that have not been assigned in the UniProtKB database.
A total of 325 unique N-terminal peptides were classified into 6 categories with regard to distributions of N-terminal peptides in NCI-H1703 and NCI-H1755 cells (Figs. 4A and 4B): (1) initial methionine depletion, NCI-H1703 (169 peptides, 62.1%) and NCI-H1755 (148 peptides, 63.5%); (2) initial methionine nondepletion, NCI-H1703 (37 peptides, 13.6%) and NCI-H1755 (28 peptides, 12.1%); (3) signal peptide depletion, NCI-H1703 (15 peptides, 5.5%) and NCI-H1755 (10 peptides, 4.3%); (4) propeptide depletion, NCI-H1703 (1 peptide, 0.4%) and NCI-H1755 (1 peptide, 0.4%); (5) mitochondrial transit peptide depletion, NCI-H1703 (17 peptides, 6.3%) and NCI-H1755 (16 peptides, 6.9%); and (6) novel N-terminal neo peptide, NCI-H1703 (33 peptides, 12.1%) and NCI-H1755 (30 peptides, 12.9%) (Supplementary Table S4).
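The category shares follow directly from the peptide counts. As a minimal sketch (the dictionary layout and function name are illustrative, and only the counts are taken from the text), the per-cell-line totals and percentages can be recomputed as follows.

```python
# (NCI-H1703, NCI-H1755) peptide counts per category, as listed in the text
CATEGORY_COUNTS = {
    "initial methionine depletion": (169, 148),
    "initial methionine nondepletion": (37, 28),
    "signal peptide depletion": (15, 10),
    "propeptide depletion": (1, 1),
    "mitochondrial transit peptide depletion": (17, 16),
    "novel N-terminal neo peptide": (33, 30),
}


def category_percentages(counts):
    """Return per-cell-line totals and each category's share in percent."""
    totals = [sum(pair[i] for pair in counts.values()) for i in range(2)]
    shares = {
        name: tuple(100.0 * pair[i] / totals[i] for i in range(2))
        for name, pair in counts.items()
    }
    return totals, shares


totals, shares = category_percentages(CATEGORY_COUNTS)
print("totals (NCI-H1703, NCI-H1755):", totals)
for name, (h1703, h1755) in shares.items():
    print(f"{name}: {h1703:.1f}% / {h1755:.1f}%")
```

The totals recover the 272 and 233 peptides identified in NCI-H1703 and NCI-H1755 cells, respectively, so the 6 categories partition each dataset completely.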
We performed a pathway analysis of the differentially expressed proteins and the identified N-terminal peptides in the 2 cell lines. To define the related pathways, all proteins in the lists were subjected to KEGG pathway analysis (Supplementary Fig. S4). Fourteen proteins were involved in the focal adhesion pathway, which relates to cell invasion, growth, proliferation, and migration (Supplementary Table S5); 5 of them (FLNA, FLNB, CAV1, MYL12B, and CAPN2) were common to the two parallel experiments. Three proteins (CRKL, PPP1CB, and MAPK3) were identified only in the N-terminal peptide analysis, and 6 proteins (VASP, VCL, RHOA, ACTN4, MAPK1, and ITGA2) appeared only in the label-free quantitative analysis. Thirteen of the 14 focal adhesion proteins showed differential expression in at least 1 experiment; the exception was FLNA, which contained a novel N-terminal neo peptide (PATEKDLAEDAPWKKIQQNTFTR) in both the NCI-H1703 and NCI-H1755 lines (Supplementary Table S5 and Fig. 5).
Six proteins (ITGA2, FLNA, FLNB, CAPN2, ACTN4, and MAPK1) were upregulated in metastatic lung cancer cells by label-free quantification analysis versus 3 downregulated proteins (RHOA, VASP, and VCL); 2 proteins (CAV1 and MYL12B) were not differentially expressed. Three proteins (CRKL, PPP1CB, and MAPK3) were identified only in the N-terminal peptide analysis, in which we identified a fragment (novel N-terminal neo peptide) from CRKL in NCI-H1703 cells and methionine-depleted N-terminal peptides from PPP1CB and MAPK3 at the initial N-terminus. Protein phosphatase 1 (PPP1CB) is overexpressed in lung cancer (Liu et al., 2007) and is activated by phosphorylation. Although PPP1CB was detected by N-terminal peptide analysis only in NCI-H1755 cells, we excluded it from subsequent analyses because phosphorylation data were lacking in this analysis.
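The bookkeeping of focal adhesion proteins across the two experiments can be expressed with set operations. The sketch below uses only the gene symbols listed in the text; the variable names are illustrative.

```python
# Focal adhesion proteins seen in both experiments
common = {"FLNA", "FLNB", "CAV1", "MYL12B", "CAPN2"}

# Proteins seen in one experiment only
n_terminal_only = {"CRKL", "PPP1CB", "MAPK3"}
label_free_only = {"VASP", "VCL", "RHOA", "ACTN4", "MAPK1", "ITGA2"}

label_free = common | label_free_only
n_terminal = common | n_terminal_only

# Direction of change in the label-free analysis (NCI-H1755 vs NCI-H1703)
upregulated = {"ITGA2", "FLNA", "FLNB", "CAPN2", "ACTN4", "MAPK1"}
downregulated = {"RHOA", "VASP", "VCL"}
unchanged = {"CAV1", "MYL12B"}

print("label-free:", len(label_free))
print("N-terminal:", len(n_terminal))
print("union:", len(label_free | n_terminal))
print("partition complete:", upregulated | downregulated | unchanged == label_free)
```

The union gives 14 proteins, with 11 from the label-free analysis and 8 from the N-terminal peptide analysis, and the three expression classes together cover all 11 label-free proteins.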
Most NSCLC patients develop metastases, resulting in incurable disease at the time of diagnosis. Despite the advances in cancer research, there are few biomarkers for early-stage cancer, and our understanding of metastasis is poor (Tan et al., 2012). Also, metastasis has become the chief obstacle to the treatment of lung cancer. Thus, it will be helpful to determine the mechanisms of metastasis. To this end, our study has generated phenotypic data from primary and metastatic NSCLC using NCI-H1703 and NCI-H1755 cells, respectively.
Label-free quantitative analysis, based on MS1 peak intensities (Domon and Aebersold, 2006) and MS/MS spectral counts (Liu et al., 2004), is valuable in the large-scale analysis of proteins and peptides. General analysis of spectral counts has a limit of quantitation for low-abundance proteins (≤ 4 spectra detected) and post-translationally modified proteins (Freund and Prenni, 2013). However, the analysis is suitable for detecting subtle abundance changes in most proteins with high sensitivity and reproducibility (Old et al., 2005).
In this study, we identified 2130 nonredundant proteins with 218,323 spectra by cell lysate profiling at a minimum of 2 distinct peptides per protein, based on an FDR of 0.3%. We also required 5 or more spectral counts for the identifications, for which spectral counts were normalized by NSAF. Lastly, 671 proteins were used for the label-free quantification, which allowed us to identify differentially expressed proteins (
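As a minimal sketch of the normalization step, the commonly used NSAF definition divides each protein's spectral count by its length and then by the sum of these length-normalized values over all proteins in the run. The protein names, counts, and lengths below are hypothetical, and the placement of the 5-count filter is an assumption rather than a description of the exact workflow used here.

```python
import math


def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for each protein in one run."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def quantifiable(spectral_counts, min_count=5):
    """Proteins that meet the minimum spectral-count requirement."""
    return {p for p, count in spectral_counts.items() if count >= min_count}


def log2_ratio(nsaf_metastatic, nsaf_primary):
    """log2(NSAF in one run / NSAF in the other) for a single protein."""
    return math.log2(nsaf_metastatic / nsaf_primary)


# Hypothetical example: two proteins measured in one run
counts = {"P1": 10, "P2": 20}
lengths = {"P1": 100, "P2": 400}
values = nsaf(counts, lengths)
print(values)
print(sorted(quantifiable({"P1": 10, "P2": 20, "P3": 4})))
```

Because P2 is four times longer than P1, its doubled spectral count still yields half the NSAF value, which illustrates why length normalization matters when comparing proteins of different sizes.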
Of the 242 differentially expressed proteins, transaldolase (TALDO1) is a novel serum biomarker in a model of hepatocellular carcinoma (HCC) metastasis and in HCC patients (Wang et al., 2011). TALDO1 was overexpressed in NCI-H1755 versus NCI-H1703 cells. Ghosh et al. reported global proteomic alterations in colorectal cancer cell metastasis, 8 proteins of which were consistent with our dataset: 3 upregulated proteins (ALDH2, HSP90B1, and PDIA4) and 5 downregulated proteins (EIF2S2, MCM6, MCM7, PSMC1, and PSMC2) (Ghosh et al., 2011).
Many proteins, such as isoform 2 of filamin-A (FLNA), isoform 1 of filamin-B (FLNB), isoform A of prelamin-A/C (LMNA), and vimentin (VIM), which were classified under the GO term “cell structure and motility,” were upregulated in the metastatic NCI-H1755 line (Supplementary Table S1). In particular, LMNA is a metastatic biomarker of colorectal cancer cells (Willis et al., 2008) and a marker of embryonic stem cell differentiation (Constantinescu et al., 2006), although this status has not been reported in NSCLC metastasis.
Cell proliferation molecules, such as isoform 1 of protein CDV3 homolog (CDV3), isoform 1 of epidermal growth factor receptor (EGFR), and histone-binding protein RBBP7 (RBBP7), were downregulated in the NCI-H1755 cells. Conversely, isoform 1 of annexin A7 (ANXA7), 60-kDa heat shock protein mitochondrial (HSPD1), proliferating cell nuclear antigen (PCNA), and isoform 3 of thioredoxin reductase 1 cytoplasmic (TXNRD1) were upregulated in this line. ANXA7 is a biomarker of progression in prostate and breast cancer (Srivastava et al., 2001); we also noted a 1.7-fold increase in NCI-H1755 cells.
Protein fragmentation has been linked to cancer metastasis. Several studies have demonstrated that potential cancer biomarkers, such as HER2 rb2 and CYFRA 21-1, are generated by protein fragmentation (Pujol et al., 1993; Streckfus et al., 2000). For example, CYFRA 21-1, a protein fragment, is known to be associated with lung cancer metastasis, although it is not a specific marker for lung cancer diagnosis. In searching for markers that are elicited by protein fragmentation, we identified newly generated N-terminal peptides using positional proteomics methods. In brief, natural N-termini are blocked by certain labeling methods, such as acetylation (McDonald and Beynon, 2006), dimethylation (Hsu et al., 2003), iTRAQ (Prudova et al., 2010), and PITC Edman (Dugaiczyk et al., 1982). In our study, N-termini were labeled by acetylation, based on its simplicity and high labeling efficiency. Ultimately, we identified 27 novel N-terminal neo peptides that were differentially generated between metastatic cells and primary cancer cells. Notably, natural cleavages of N-terminal peptides, such as initial methionine depletion, signal peptide depletion, propeptide depletion, and transit peptide depletion, were also detected and annotated using the UniProt database (Apweiler et al., 2004). Specifically, of the initial methionine-depleted proteins, we identified 44 proteins that do not exist in the UniProtKB database.
In the N-terminal peptide analysis, 92 peptides from 87 proteins were detected in NCI-H1703 cells, whereas 53 peptides from 46 proteins were identified in NCI-H1755 cells (Supplementary Fig. S3); 27 peptides were categorized as novel N-terminal neo peptides (like the fragment peptides), and 15 novel N-terminal neo peptides appeared only in NCI-H1703 cells. Notably, EPH receptor A2 (EPHA2) is a marker of NSCLC progression (Brannan et al., 2009), and a novel N-terminal neo peptide of EPHA2 was detected in primary cancer cells. However, EPHA2 was observed in both cell lines by label-free quantitative analysis (not used for quantification due to a spectral count below 5).
Five proteins were identified with fragment N-terminal peptides, whereas their expression did not differ by label-free quantification analysis (Table 2). Four of them (DDX3X, RPL4, RPL30, and XRCC6) were observed only in NCI-H1703 cells by N-terminal peptide analysis, whereas SHMT2 was detected only in NCI-H1755 cells. Further, these four proteins (DDX3X, RPL4, RPL30, and XRCC6) are associated with cell proliferation and differentiation in metastasis (Bauer et al., 2012; Li et al., 2011; Yoon et al., 2006). In this study, the four proteins that were identified with novel N-terminal neo peptides were expressed in equal amounts in the cell lines, but they could not affect the metastasis of primary cancer cells (NCI-H1703).
We found 138 proteins that were common to both experiments (Supplementary Table S6). For most proteins whose natural N-terminal peptides were differentially identified by N-terminal analysis, with the exception of histone-binding protein RBBP7 (RBBP7), the results were consistent with their expression levels in the label-free quantification analysis. For example, creatine kinase B-type (CKB) was identified with initial methionine-depleted N-termini only in NCI-H1703 cells by N-terminal analysis, and CKB was significantly upregulated in NCI-H1703 cells by label-free quantitative analysis.
In the classification of the 138 commonly identified proteins by KEGG pathway, the proteins were primarily involved in aminoacyl-tRNA biosynthesis, the pentose phosphate pathway, the proteasome, arginine and proline metabolism, DNA replication, and focal adhesion (Supplementary Fig. S4). Focal adhesion is a major pathway of cancer metastasis, and we identified 15 proteins that were related to focal adhesion in the 2 profiling experiments (Fig. 5 and Supplementary Table S5). Of the 138 proteins, 11 proteins, identified by label-free quantification analysis, participated in focal adhesion: 6 proteins were upregulated, 3 proteins were downregulated, and 2 proteins were not differentially expressed. Conversely, of the proteins that were identified by N-terminal peptide analysis, 8 were involved in focal adhesion.
Integrin alpha-2 (ITGA2) was upregulated by 2.4-fold in NCI-H1755 cells. Apparently, ITGA2 mediates metastasis to the liver by regulating the focal adhesion pathway (Yoshimura et al., 2009). Overexpression of integrin proteins (ITGA and ITGB) initiates a signaling cascade to alpha-actinin-4 (ACTN4), FLNA, FLNB, and FAK (not identified in our data) to effect cell proliferation and growth (Shibue and Weinberg, 2009) (Fig. 5). Notably, ACTN4, FLNA, and FLNB were overexpressed in NCI-H1755 cells in this study. In addition, MAPK1 (also known as ERK2), upregulated in metastatic cells, is a point at which multiple biochemical signals integrate (Wu et al., 2008) (Fig. 5).
MAP kinases mediate many processes in cancer cells, such as proliferation, migration, invasion, and metastasis (Obchoei et al., 2011). Increased expression of MAPK1 promotes the expression of CAPN2, which functions in cell movement, migration, and invasion during metastasis (Storr et al., 2011). In the N-terminal peptide analysis, v-crk sarcoma virus CT10 oncogene homolog (avian)-like (CRKL) was identified with a novel N-terminal neo peptide only in NCI-H1703 cells. Because CRKL activates ERK signaling to promote cell proliferation, survival, and invasion in lung cancer (Kim et al., 2010), we hypothesize that CRKL function is regulated by fragmentation events during metastasis.
In summary, we have identified differentially expressed proteins that distinguish primary and metastatic lung cancer. Many of these quantitative proteins and N-terminal peptides are involved in pathways in cell migration, proliferation, and metastasis. Thus, our datasets of proteins and fragment peptides in lung cells might be valuable in discovering and validating lung cancer biomarkers and metastasis markers.
Table 1. Top 15
IPIa | MW (kDa) | Ratiob | p-valuec | Gene symbol | Protein name |
---|---|---|---|---|---|
IPI00013744 | 129.3 | 6.6 | 0.0001 | ITGA2 | Integrin alpha-2 |
IPI00006663 | 56.4 | 6.57 | 0.0004 | ALDH2 | Aldehyde dehydrogenase, mitochondrial |
IPI00553131 | 38.3 | 5.9 | 0.0003 | GALE | UDP-glucose 4-epimerase |
IPI00413641 | 35.9 | 3.81 | 0.0022 | AKR1B1 | Aldose reductase |
IPI00216008 | 62.5 | 3.35 | 0.0023 | G6PD | Isoform Long of Glucose-6-phosphate 1-dehydrogenase |
IPI00017376 | 86.5 | 2.86 | 0.0015 | SEC23B | Protein transport protein Sec23B |
IPI00215743 | 152.5 | 2.6 | 0.0001 | RRBP1 | Isoform 3 of Ribosome-binding protein 1 |
IPI00001539 | 41.9 | 2.38 | 0.0009 | ACAA2 | 3-ketoacyl-CoA thiolase, mitochondrial |
IPI00292771 | 238.3 | 2.23 | 0.0018 | NUMA1 | Isoform 1 of Nuclear mitotic apparatus protein 1 |
IPI00744692 | 37.5 | 2.22 | 0.0000 | TALDO1 | Transaldolase |
IPI00643920 | 68.8 | 2.04 | 0.0016 | TKT | cDNA FLJ54957, highly similar to Transketolase |
IPI00414717 | 134.6 | 2.03 | 0.0223 | GLG1 | Isoform 2 of Golgi apparatus protein 1 |
IPI00219525 | 51.9 | 2.01 | 0.0001 | PGD | 6-phosphogluconate dehydrogenase, decarboxylating |
IPI00027223 | 46.7 | 2.01 | 0.0010 | IDH1 | Isocitrate dehydrogenase [NADP] cytoplasmic |
IPI00003479 | 41.4 | 2 | 0.0082 | MAPK1 | Mitogen-activated protein kinase 1 |
IPI00001453 | 55.4 | -7.47 | 0.0055 | INA | Alpha-internexin |
IPI00397526 | 230.8 | -6.87 | 0.0039 | MYH10 | Isoform 1 of Myosin-10 |
IPI00607787 | 58.7 | -6.64 | 0.0041 | UAP1 | Isoform 3 of UDP-N-acetylhexosamine pyrophosphorylase |
IPI00856045 | 616.6 | -6.59 | 0.0028 | AHNAK2 | Isoform 1 of Protein AHNAK2 |
IPI00333619 | 54.8 | -6.52 | 0.0027 | ALDH3A2 | Isoform 1 of Fatty aldehyde dehydrogenase |
IPI00178150 | 139.9 | -6.36 | 0.0083 | KIF4A | Isoform 1 of Chromosome-associated kinesin KIF4A |
IPI00237884 | 181.0 | -6.26 | 0.0416 | AKAP12 | Isoform 1 of A-kinase anchor protein 12 |
IPI00218775 | 51.2 | -6.12 | 0.0051 | FKBP5 | Peptidyl-prolyl cis-trans isomerase FKBP5 |
IPI00023972 | 50.6 | -6.11 | 0.0041 | DDX47 | Probable ATP-dependent RNA helicase DDX47 |
IPI00003505 | 48.6 | -5.96 | 0.0039 | TRIP13 | Isoform 1 of Pachytene checkpoint protein 2 homolog |
IPI00396627 | 92.1 | -5.95 | 0.0118 | ELAC2 | Isoform 1 of Zinc phosphodiesterase ELAC protein 2 |
IPI00022977 | 42.6 | -5.89 | 0.0085 | CKB | Creatine kinase B-type |
IPI00294187 | 75.6 | -5.89 | 0.0008 | PADI2 | Protein-arginine deiminase type-2 |
IPI00017303 | 104.7 | -5.89 | 0.0201 | MSH2 | DNA mismatch repair protein Msh2 |
IPI00218922 | 88.0 | -5.77 | 0.0110 | SEC63 | Translocation protein SEC63 homolog |
aIPI accession number of each protein
bSignificant difference expression log2 ratio of NCI-H1755/NCI-H1703 with NSAF value
cSignificant difference in t-test (
Table 2. Proteolytic events identified with less than 1.5 fold change.
IPI | Peptide sequencea | Ratiob | N-terminal analysisc | Gene symbol | Protein name |
---|---|---|---|---|---|
IPI00215637 | N. | -0.48 | NCI-H1703 | DDX3X | ATP-dependent RNA helicase DDX3X |
IPI00003918 | R. | -0.37 | NCI-H1703 | RPL4 | 60S ribosomal protein L4 |
IPI00219156 | V. | -0.15 | NCI-H1703 | RPL30 | 60S ribosomal protein L30 |
IPI00644712 | R. | 0.14 | NCI-H1703 | XRCC6 | X-ray repair cross-complementing protein 6 |
IPI00002520 | Q. | 0.30 | NCI-H1755 | SHMT2 | Serine hydroxymethyltransferase, mitochondrial |
aObserved peptide sequence from N-terminal peptide analysis is written in italics.
bExpression log2 ratio of NCI-H1755/NCI-H1703 with NSAF value by label-free analysis
cCell line with detected peptide sequences from N-terminal analysis
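The fold-change criterion in this table can be related to the log2 ratios listed above. The following sketch assumes that "less than 1.5 fold change" corresponds to an absolute log2 ratio below log2(1.5); the dictionary simply restates the ratios from the table.

```python
import math

# log2 ratios (NCI-H1755/NCI-H1703) from the table
TABLE2_LOG2_RATIOS = {
    "DDX3X": -0.48,
    "RPL4": -0.37,
    "RPL30": -0.15,
    "XRCC6": 0.14,
    "SHMT2": 0.30,
}

threshold = math.log2(1.5)  # log2 value that corresponds to a 1.5-fold change

# Convert each log2 ratio to an absolute fold change
fold_changes = {gene: 2 ** abs(ratio) for gene, ratio in TABLE2_LOG2_RATIOS.items()}
all_within = all(abs(ratio) < threshold for ratio in TABLE2_LOG2_RATIOS.values())

print(f"log2(1.5) = {threshold:.3f}")
for gene, fold in fold_changes.items():
    print(f"{gene}: {fold:.2f}-fold")
print("all below 1.5-fold:", all_within)
```

Under this reading, all five proteins fall below the 1.5-fold cutoff, consistent with their classification as not differentially expressed in the label-free analysis.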