Linking patient outcome to high throughput protein expression data identifies novel regulators of colorectal adenocarcinoma aggressiveness

F1000Res. 2015 Apr 24;4:99. doi: 10.12688/f1000research.6388.1. eCollection 2015.


A key question in cancer systems biology is how to use molecular data to predict the biological behavior of tumors from individual patients. While genomics data have been heavily used, protein signaling data are more directly connected to biological phenotype and might better predict cancer phenotypes such as invasion, metastasis, and patient survival. In this study, we mined publicly available data for colorectal adenocarcinoma from The Cancer Genome Atlas (TCGA) and identified protein expression and signaling changes that are statistically associated with patient outcome. Our analysis identified a number of known and potentially new regulators of colorectal cancer. High levels of insulin-like growth factor binding protein 2 (IGFBP2) were associated with both recurrence and death, and this association was validated by immunohistochemical staining of a tissue microarray for a secondary patient dataset. Interestingly, GATA binding protein 3 (GATA3) was the protein most frequently associated with death in our analysis, and GATA3 expression was significantly decreased in tumor samples from deceased stage I-II patients. Experimental studies using engineered colon cancer cell lines show that exogenous expression of GATA3 decreases three-dimensional colony growth and invasiveness of colon cancer cells but does not affect two-dimensional proliferation. These findings suggest that protein data are useful for biomarker discovery and identify GATA3 as a regulator of colorectal cancer aggressiveness.

Keywords: Bioinformatics; Cancer Biology; Colorectal Cancer; Prognosis; Proteomics; Reverse Phase Protein Array; TCGA.