Comparative Study
Gastroenterology. 2017 Oct;153(4):1082-1095.
doi: 10.1053/j.gastro.2017.06.008. Epub 2017 Jun 16.

Colorectal Cancer Cell Line Proteomes Are Representative of Primary Tumors and Predict Drug Sensitivity

Free PMC article

Jing Wang et al. Gastroenterology.


Background and aims: Proteomics holds promise for individualizing cancer treatment. We analyzed to what extent the proteomic landscape of human colorectal cancer (CRC) is maintained in established CRC cell lines and the utility of proteomics for predicting therapeutic responses.

Methods: Proteomic and transcriptomic analyses were performed on 44 CRC cell lines, compared against primary CRCs (n=95) and normal tissues (n=60), and integrated with genomic and drug sensitivity data.

Results: Cell lines mirrored the proteomic aberrations of primary tumors, in particular for intrinsic programs. Tumor relationships of protein expression with DNA copy number aberrations and signatures of post-transcriptional regulation were recapitulated in cell lines. The 5 proteomic subtypes previously identified in tumors were represented among cell lines. Nonetheless, systematic differences between cell line and tumor proteomes were apparent, attributable to stroma, extrinsic signaling, and growth conditions. Contribution of tumor stroma obscured signatures of DNA mismatch repair identified in cell lines with a hypermutation phenotype. Global proteomic data showed improved utility for predicting both known drug-target relationships and overall drug sensitivity as compared with genomic or transcriptomic measurements. Inhibition of targetable proteins associated with drug responses further identified corresponding synergistic or antagonistic drug combinations. Our data provide evidence for CRC proteomic subtype-specific drug responses.

Conclusions: Proteomes of established CRC cell lines are representative of primary tumors. Proteomic data tend to exhibit improved prediction of drug sensitivity as compared with genomic and transcriptomic profiles. Our integrative proteogenomic analysis highlights the potential of proteome profiling to inform personalized cancer medicine.

Keywords: Cell Lines; Colorectal Cancer; Drug Sensitivity; Proteomics.

Conflict of interest statement


No conflict of interest or competing financial interests to disclose for all authors.


Figure 1. Comparison of protein abundances between CRC cell lines and tumors
(a) Volcano plot indicating proteins overexpressed in cell lines (blue) or tumors (red) (FDR<5% and fold change>2); other genes are colored in grey. (b) The GO Biological Processes (BP) enriched for proteins overexpressed in cell lines (blue) or tumors (red), identified using WebGestalt. (c) Overlap of stroma signatures with genes overexpressed in tumors versus other genes. p value for hypergeometric test. (d) Distributions of the signed -log10 p values (voom/limma) of the associations between protein abundance and tumor purity score for genes overexpressed in tumors versus other genes. p value for Wilcoxon rank sum test. (e) Heatmap of tumor stroma and epithelial protein marker expression in tumors and cell lines. The bar plot to the left of the heatmap represents the signed -log10 FDR (voom/limma) comparing protein abundances of tumor and cell line samples. (f) Box plots comparing protein abundance measurements for cell lines and tumors against tumor-cell specific IHC scores defined by the Human Protein Atlas. p values for Jonckheere’s trend test.
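Panel (c) reports a hypergeometric p value for the overlap between stroma signatures and tumor-overexpressed genes. The following is a minimal sketch of a one-sided enrichment test of that kind, not the authors' implementation: the function name `hypergeom_upper_tail` and the example counts are illustrative assumptions and do not come from the study.

```python
from math import comb


def hypergeom_upper_tail(total_genes, signature_genes, selected_genes, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    total_genes     -- size of the gene universe (N)
    signature_genes -- genes in the signature, e.g. a stroma signature (K)
    selected_genes  -- genes in the selected set, e.g. overexpressed in tumors (n)
    overlap         -- observed number of genes in both sets (k)
    """
    denominator = comb(total_genes, selected_genes)
    largest_possible = min(signature_genes, selected_genes)
    # Sum the probability mass of every overlap at least as large as observed.
    numerator = sum(
        comb(signature_genes, i)
        * comb(total_genes - signature_genes, selected_genes - i)
        for i in range(overlap, largest_possible + 1)
    )
    return numerator / denominator


# Hypothetical counts, chosen only to show the call.
p_value = hypergeom_upper_tail(
    total_genes=5000, signature_genes=200, selected_genes=400, overlap=60
)
print(f"enrichment p value: {p_value:.3g}")
```

Exact integer arithmetic is used so the sketch needs only the standard library; a statistics package would normally be preferred for large gene universes.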
Figure 2. Pathways associated with the hypermutation phenotype in CRC cell lines and tumors
(a) GSEA enrichment scores for significant KEGG pathways in cell lines and tumors. Red and blue bars represent the positively and negatively enriched pathways, respectively. Numbers in parentheses give the enrichment FDR of each pathway. (b) Genes sorted by differential expression between hypermutated and non-hypermutated samples. Red and green represent overexpression in hypermutated and non-hypermutated samples, respectively. Bars in the bottom panel represent genes annotated to the mismatch repair pathway, with blue bars indicating the leading-edge genes reported by GSEA. (c) Comparison of protein abundance between hypermutated and non-hypermutated samples for the leading-edge genes identified from the cell line data.
Figure 3. Comparison of the correlations between mRNA and protein abundance in tumor and cell line data
(a) Correlations between steady state mRNA and protein abundance across genes within individual samples. (b) Correlations between mRNA and protein variation across cell line or tumor samples for each gene. (c) GSEA KEGG enrichment for average differences in mRNA-protein ranks across genes in both the cell line and tumor data. Genes colored red are ranked higher in RNA, genes colored green are ranked higher in proteomics, and genes colored blue are the leading-edge GSEA genes.
Figure 4. Comparison of cell lines and tumors to normal tissues based on protein abundance data
(a) Correlation of protein expression changes for cell line and tumor relative to normal tissue. (b) Overlap between up-regulated and down-regulated proteins (FDR<0.05, fold change>2) relative to normal. (c) Heat map showing protein expression in normal, tumor and cell line samples. (d) Coordinated protein expression changes within KEGG pathways determined using a linear mixed-effects model. Mean log fold change as compared to normal and heatmap of pathway expression shown for normal, tumor and cell line samples.
Figure 5. Proteome alterations associated with copy-number aberrations
(a) DNA copy-number spectra (% gain = red bars, % loss = blue bars, relative to ploidy) in cell lines and tumors. (b) Strengths of association for protein expression with corresponding DNA copy-number changes (-log10(FDR)). Grey = not significantly associated with copy number alterations, blue = significant across proteomics cell line and tumor data only, green = significant for both proteomics and mRNA expression across cell line and tumor, red = candidate tumor suppressor and oncogenes.
Figure 6. Proteomics data utility for predicting therapeutic responses
(a, b) Associations of proteomic, mutation, DNA copy number and mRNA data with (a) established drug-target associations and (b) drug-pathway associations. Associations are shown for drug-target gene associations quantifiable at the protein level and significant in at least one of the four modalities as signed -log10(FDR) values from voom/limma and GSEA analyses, respectively. (c) Comparison of the utility of four omic modalities to predict drug sensitivity for 5-fluorouracil (5-FU), erlotinib, oxaliplatin, regorafenib and SN-38: proteomic data (red); RNA-Seq data (blue); CNA data (green); and exome mutation data (yellow). For each drug-omic modality combination, areas under the receiver operating characteristic curve (AUROCs) were generated from 100 repetitions of 5-fold cross-validation. The two-sided Wilcoxon rank sum test was used to compare the performance between protein-based models and models based on other omics data types. For each comparison, the p value is colored based on the color of the omic data type with significantly better performance. (d–e) Pharmacological targeting of proteins associated with resistance or sensitivity to (d) 5-FU or (e) SN-38. Bliss excess values are shown for drug combinations with 5-FU (at IC30 concentration) and SN-38 (at IC40 concentration) in HCT116 cells. The protein targets were restricted to those with FDR < 0.2 from the relevant voom/limma calculation; drugs are detailed in Supplementary Tables 4–5. p values for Student’s t-test. (f) Dose-response plots for selected compounds alone (black), with either 5-FU or SN-38 (blue), or the predicted response under the assumption of Bliss independence for the two compounds (green). Bliss synergy = blue line below green line; Bliss antagonism = blue line above green line.
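Panels (d) to (f) rely on Bliss independence, under which two drugs acting independently leave a surviving fraction equal to the product of the fractions each leaves alone. The sketch below shows one common way to compute the expected response and the Bliss excess. It assumes responses are expressed as fractional viability between 0 and 1 and that a positive excess denotes synergy; the function names and example values are hypothetical, and the exact convention used in the study is not stated in this legend.

```python
def bliss_expected_viability(viability_a, viability_b):
    """Expected fractional viability of the combination under Bliss independence."""
    return viability_a * viability_b


def bliss_excess(viability_a, viability_b, viability_combo):
    """Observed minus expected fractional inhibition.

    Assumed sign convention: positive = synergy (more killing than expected),
    negative = antagonism.
    """
    expected_inhibition = 1.0 - bliss_expected_viability(viability_a, viability_b)
    observed_inhibition = 1.0 - viability_combo
    return observed_inhibition - expected_inhibition


# Hypothetical measurements: drug A alone leaves 70% of cells viable,
# drug B alone leaves 50%, and the combination leaves 20%.
expected = bliss_expected_viability(0.7, 0.5)
excess = bliss_excess(0.7, 0.5, 0.2)
print(f"expected viability: {expected:.2f}, Bliss excess: {excess:+.2f}")
```

In viability terms this matches the reading given for panel (f): an observed curve below the Bliss prediction corresponds to a positive excess under the convention assumed here.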
Figure 7. Concordance of proteomic CRC subtypes in cell lines and tumors
(a) Heatmap of protein abundances indicating proteomic subtypes for tumors (left panel) and cell lines (right panel). Samples are arranged along the X axis and genes are arranged along the Y axis. Colors indicate increased (red) and decreased (blue) expression relative to the mean-centered and scaled expression of the gene (normalized CPM) across the samples. (b) Representation of genomic hallmarks among proteomic subtypes. (c) Drug responses of proteomic subtypes to 5-fluorouracil (5-FU), erlotinib, oxaliplatin, regorafenib and SN-38 treatment, and relationships with cell doubling time. Puni (univariate) is the P-value obtained from univariate ANOVA, and Padj (adjusted) is the P-value from two-way ANOVA adjusting for cell doubling time.
