Background: Conventional renal cell carcinoma (cRCC) accounts for most of the deaths due to kidney cancer. Tumor stage, grade, and patient performance status are used currently to predict survival after surgery. Our goal was to identify gene expression features, using comprehensive gene expression profiling, that correlate with survival.
Methods and findings: Gene expression profiles were determined in 177 primary cRCCs using DNA microarrays. Unsupervised hierarchical clustering analysis segregated cRCC into five gene expression subgroups. Expression subgroup was correlated with survival in long-term follow-up and was independent of grade, stage, and performance status. The tumors were then divided evenly into training and test sets that were balanced for grade, stage, performance status, and length of follow-up. A semisupervised learning algorithm (supervised principal components analysis) was applied to identify transcripts whose expression was associated with survival in the training set, and the performance of this gene expression-based survival predictor was assessed using the test set. With this method, we identified 259 genes that accurately predicted disease-specific survival among patients in the independent validation group (p < 0.001). In multivariate analysis, the gene expression predictor was a strong predictor of survival independent of tumor stage, grade, and performance status (p < 0.001).
Conclusions: cRCC displays molecular heterogeneity and can be separated into gene expression subgroups that correlate with survival after surgery. We have identified a set of 259 genes that predict survival after surgery independent of clinical prognostic factors.