Background: Molecular heterogeneity of tumors suggests the presence of multiple subclones that may limit response to targeted therapies and contribute to the acquisition of drug resistance, but quantifying this heterogeneity has remained challenging.
Results: We performed simulations to evaluate which statistical measures best capture the molecular diversity within a group of tumors for either continuous (gene expression) or discrete (mutations, copy number alterations) molecular data. Dispersion-based metrics in the principal component space best captured the underlying heterogeneity. To demonstrate the utility of these measures, we characterized the diversity in transcriptional and genomic profiles of different breast tumor subtypes, and showed that basal-like or triple-negative breast cancers (TNBC) are significantly more heterogeneous molecularly than other subtypes. Our analysis also suggests that transcriptional diversity is a global characteristic of the tumors, observed across the majority of molecular pathways. Among basal-like tumors, those that were resistant to multi-agent chemotherapy showed greater transcriptional diversity than chemotherapy-sensitive tumors, suggesting that multiple mechanisms may contribute to chemotherapy resistance.
Conclusions: We proposed and validated measures that quantify the transcriptional and genomic diversity of tumors. We applied the new measures to genomic data from breast tumors and demonstrated that basal-like breast cancers are significantly more diverse than other breast cancers. The observation that chemotherapy-resistant tumors are significantly more diverse molecularly than chemotherapy-sensitive tumors implies that multiple resistance mechanisms may be active, thus limiting the sensitivity and accuracy of predictive markers of chemotherapy response.
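
As an illustration of the dispersion-based measures in principal component space mentioned in the Results, the following is a minimal sketch and not the exact metric used in the study. It assumes dispersion is defined as the mean Euclidean distance of a group's samples to the group centroid, computed on the top principal components of the pooled data. The function names (`pca_scores`, `group_dispersion`), the number of components, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # sample coordinates in PC space

def group_dispersion(scores, labels):
    """Assumed metric: mean Euclidean distance of samples to their group centroid."""
    labels = np.asarray(labels)
    result = {}
    for group in np.unique(labels):
        pts = scores[labels == group]
        centroid = pts.mean(axis=0)
        result[str(group)] = float(np.linalg.norm(pts - centroid, axis=1).mean())
    return result

# Synthetic data only (not tumor data): two groups of 40 samples x 50 features,
# one with low and one with high within-group variability.
rng = np.random.default_rng(0)
tight = rng.normal(0.0, 1.0, size=(40, 50))
diffuse = rng.normal(0.0, 3.0, size=(40, 50))
X = np.vstack([tight, diffuse])
labels = ["tight"] * 40 + ["diffuse"] * 40

scores = pca_scores(X, n_components=5)
dispersion = group_dispersion(scores, labels)
print(dispersion)
```

In this sketch the principal components are fit on all samples together, so that the groups are compared in a shared coordinate system; the group with larger simulated variability receives the larger dispersion value.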