Background: The National Cancer Institute (NCI) has undertaken a twenty-year project to characterize compound sensitivity patterns in a selected set of sixty tumor-derived cell lines. Previous studies have explored the relationship of compound sensitivity patterns to gene expression, protein expression, and DNA copy number in these same cell lines. A strong correlation between the expression pattern of a biomarker and sensitivity to a compound could suggest a clinically interesting biological relationship between the two.
Results: We isolated RNA from the fifty-nine publicly available cell lines and measured the expression of 40,000 genes using cDNA microarrays. Comparison of this data set with published gene expression data sets demonstrates a high degree of reproducibility in expression-level measurements, even with completely independent RNA preparations and array technologies. Using the fifty-nine cell lines for discovery and an additional seven cell lines, for which extensive compound sensitivity data were available, as a test set, we determined that gene-compound pairs with a correlation coefficient above 0.6 had a false discovery rate of approximately 5%. Large-scale features of the gene expression and chemosensitivity data, such as tissue of origin and other physiological factors, did not appear to explain the majority of correlations between gene and compound patterns.
Conclusion: Comparing gene expression with compound sensitivity across panels of cell lines yielded a relatively high validation rate and a low false discovery rate, supporting the use of this approach and these data sets for identifying candidate biomarkers and targeted biologically active compounds.
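
The gene-compound screen summarized in the Results can be illustrated with a minimal sketch. It assumes Pearson correlation across cell lines and uses the 0.6 threshold from the text. The function names, the dictionary-based data layout, and in particular the replication rule (a pair counts as not replicated when its correlation in the test set is at or below a hypothetical `min_r`) are illustrative assumptions, not the procedure used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / sqrt(sxx * syy)


def screen_pairs(expression, sensitivity, threshold=0.6):
    """Return {(gene, compound): r} for pairs with r above the threshold.

    expression:  gene name -> expression values, one per cell line
    sensitivity: compound name -> sensitivity values, same cell-line order
    """
    hits = {}
    for gene, g_values in expression.items():
        for compound, c_values in sensitivity.items():
            r = pearson(g_values, c_values)
            if r > threshold:
                hits[(gene, compound)] = r
    return hits


def fraction_not_replicated(hits, test_expression, test_sensitivity, min_r=0.0):
    """Fraction of discovery hits whose test-set correlation is <= min_r.

    The replication rule is an assumption made for this sketch only.
    Returns None when there are no hits to evaluate.
    """
    if not hits:
        return None
    failed = sum(
        1
        for gene, compound in hits
        if pearson(test_expression[gene], test_sensitivity[compound]) <= min_r
    )
    return failed / len(hits)
```

In this layout, `screen_pairs` would be run on the discovery panel and `fraction_not_replicated` on the held-out cell lines. With only a handful of test lines, any such estimate is coarse, so the choice of `min_r` matters and should be justified separately.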