PLoS One, 4 (7), e6146

Molecular Profiling of Breast Cancer Cell Lines Defines Relevant Tumor Models and Provides a Resource for Cancer Gene Discovery


Jessica Kao et al. PLoS One.


Background: Breast cancer cell lines have been used widely to investigate breast cancer pathobiology and new therapies. Breast cancer is a molecularly heterogeneous disease, and it is important to understand which cell lines best model that diversity, and how well. In particular, microarray studies have identified molecular subtypes (luminal A, luminal B, ERBB2-associated, basal-like and normal-like) with characteristic gene-expression patterns and underlying DNA copy number alterations (CNAs). Here, we studied a collection of breast cancer cell lines to catalog molecular profiles and to assess their relation to breast cancer subtypes.

Methods: Whole-genome DNA microarrays were used to profile gene expression and CNAs in a collection of 52 widely used breast cancer cell lines, and comparisons were made to existing profiles of primary breast tumors. Hierarchical clustering was used to identify gene-expression subtypes, and Gene Set Enrichment Analysis (GSEA) to discover biological features of those subtypes. Genomic and transcriptional profiles were integrated to discover, within high-amplitude CNAs, candidate cancer genes with coordinately altered gene copy number and expression.
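The integration step described in the Methods can be illustrated with a minimal sketch. It assumes that "coordinately altered" is scored as a Pearson correlation between copy number and expression across cell lines; the abstract does not state the actual statistic. The gene names, the toy values, and the `amp_cutoff` and `min_r` thresholds are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def candidate_genes(copy_number, expression, amp_cutoff=1.0, min_r=0.6):
    """Genes amplified in at least one line whose expression tracks copy number.

    copy_number, expression: dict mapping gene -> list of log2 ratios,
    with cell lines in the same order in both dicts.
    amp_cutoff and min_r are illustrative thresholds, not values from the paper.
    """
    hits = []
    for gene, cn in copy_number.items():
        if gene not in expression:
            continue
        if max(cn) < amp_cutoff:  # never reaches high-amplitude gain
            continue
        r = pearson(cn, expression[gene])
        if r >= min_r:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: -hit[1])

# Toy data: three hypothetical genes measured in four cell lines.
copy_number = {
    "GENE_A": [2.0, 0.1, 0.0, 1.8],   # amplified, expression follows
    "GENE_B": [1.5, 0.0, 0.1, 0.0],   # amplified, expression does not follow
    "GENE_C": [0.2, 0.1, 0.0, 0.3],   # not amplified
}
expression = {
    "GENE_A": [3.1, 0.2, -0.1, 2.5],
    "GENE_B": [0.1, 0.3, -0.2, 0.2],
    "GENE_C": [1.0, 0.5, 0.0, 1.5],
}
hits = candidate_genes(copy_number, expression)
```

Filtering on amplification first keeps the correlation test restricted to loci inside high-amplitude CNAs, which is the restriction the Methods describe.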

Findings: Transcriptional profiling of breast cancer cell lines identified one luminal and two basal-like (A and B) subtypes. Luminal lines displayed an estrogen receptor (ER) signature and resembled luminal-A/B tumors, basal-A lines were associated with ETS-pathway and BRCA1 signatures and resembled basal-like tumors, and basal-B lines displayed mesenchymal and stem/progenitor-cell characteristics. Compared to tumors, cell lines exhibited similar patterns of CNA, but an overall higher complexity of CNA (genetically simple luminal-A tumors were not represented), and only partial conservation of subtype-specific CNAs. We identified 80 high-level DNA amplifications and 13 multi-copy deletions, and the resident genes with concomitantly altered gene expression, highlighting known and novel candidate breast cancer genes.

Conclusions: Overall, breast cancer cell lines were genetically more complex than tumors, but retained expression patterns with relevance to the luminal-basal subtype distinction. The compendium of molecular profiles defines cell lines suitable for investigations of subtype-specific pathobiology, cancer stem cell biology, biomarkers and therapies, and provides a resource for discovery of new breast cancer genes.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.


Figure 1. Clustering of expression profiles defines breast cancer cell line subtypes.
(A) Thumbnail “heatmap” of two-way hierarchical clustering of 50 breast cancer cell lines (columns) and 8,750 variably expressed genes (rows) (data available as Table S1). Gene expression ratios are depicted by log2 pseudocolor scale shown; gray represents poorly measured data. (B) Enlarged view of the sample dendrogram. Clustering stratifies cell lines into two main groups, luminal (blue dendrogram branches) and basal, the latter further subdivided into two subgroups, basal A (red) and basal B (orange). (C–I) Selected gene expression patterns extracted from the cluster; corresponding locations in the thumbnail are indicated by the vertical colored bars. (C) Basal-B; (D) Basal cytokeratins; (E) Basal; (F) Basal-A; (G) Luminal cytokeratins; (H) ER-associated; (I) Luminal differentiation.
Figure 2. Subtype-specific expression and molecular characteristics.
(A) Clinical, pathological and molecular characteristics of cell line expression subtypes. Black boxes indicate metastasis derivation, ER positivity, TP53 mutation, ERBB2/HER2 positivity, PTEN mutation and PIK3CA mutation. Mutation data were compiled from the Sanger and IARC websites and from cited references. White cross-hatched boxes indicate missing data. (B) Classification of cell lines by nearest resemblance to tumor gene-expression subtype: luminal A (dark blue), luminal B (light blue), ERBB2-associated (purple), basal-like (red) or normal-like (green); and by positivity (black boxes) for the 70-gene, wound and hypoxia signatures. (C) Expression levels of selected stem/progenitor cell relevant markers; log2 ratios are depicted by the pseudocolor scale shown (gray represents poorly measured data). (D) Relation of tumor subtypes to cell line subtypes. The subtype of each of 86 tumors is shown color-coded as above. Resemblance to each cell line subtype is depicted by Euclidean distance, indicated by blue intensity (representing shorter distances); the best match is bracketed in black.
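The best-match assignment in panel D can be sketched as a nearest-centroid lookup. This assumes each cell line subtype is summarized by the mean expression profile of its member lines, which the caption does not confirm. All profiles below are made-up three-gene toy values.

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(profiles):
    """Gene-wise mean of a list of equal-length profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]

def nearest_subtype(profile, centroids):
    """Return the closest subtype name and the distance to every subtype."""
    distances = {name: euclidean(profile, c) for name, c in centroids.items()}
    best = min(distances, key=distances.get)
    return best, distances

# Hypothetical log2 expression ratios for three genes in each cell line.
cell_lines = {
    "luminal": [[2.0, -1.0, 0.5], [1.6, -0.8, 0.3]],
    "basal A": [[-1.0, 1.5, 0.2], [-1.2, 1.1, 0.0]],
    "basal B": [[-0.5, 0.2, 2.0], [-0.7, 0.4, 1.6]],
}
centroids = {name: centroid(lines) for name, lines in cell_lines.items()}

# A hypothetical tumor profile measured on the same three genes.
tumor = [1.5, -0.7, 0.1]
best, distances = nearest_subtype(tumor, centroids)
```

Returning the full distance table alongside the best match mirrors the figure, which shows resemblance to every cell line subtype and brackets only the closest one.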
Figure 3. Genomic profiles define spectra of CNAs in cell line subtypes.
(A) Spectra of gains (red) and losses (green) across the genome, plotted as average log2 ratio, for 89 breast tumors (above), compared to the set of 50 cell lines profiled for both expression and CNAs (below). (B) Spectra of gains and losses for the cell line subtypes: luminal (above), basal A (middle) and basal B (below). Statistically significant subtype-specific CNAs, called by SAM (FDR<5%), are marked by a black bar. The subset of those loci that also characterize the corresponding primary breast tumor subtype is marked by an asterisk.
Figure 4. Cell line subtypes exhibit distinct genomic instabilities.
Fraction of genome comprising (A) high-level DNA amplification; or (B) low-level gain/loss, stratified by cell line subtype (luminal, basal-A, basal-B). Box plots show 25th, 50th and 75th percentiles; P-values (Student's t-test) for pairwise comparisons are shown.
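As a rough illustration of the quantities plotted here, the sketch below computes the fraction of a segmented genome that exceeds an amplification cutoff, and a pooled-variance Student's t statistic for comparing two subtypes. The segment format, the `cutoff` value and the per-line fractions are assumptions for illustration only. Converting t to a P-value needs a t-distribution, which is left out to keep the example within the standard library.

```python
from math import sqrt
from statistics import mean, variance

def fraction_amplified(segments, cutoff=1.0):
    """Fraction of covered genome length whose log2 ratio is at or above cutoff.

    segments: list of (start, end, log2_ratio) tuples; cutoff is illustrative.
    """
    total = sum(end - start for start, end, _ in segments)
    amplified = sum(end - start for start, end, ratio in segments if ratio >= cutoff)
    return amplified / total

def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# One hypothetical cell line: 10 of 100 length units lie in a high-level gain.
segments = [(0, 50, 0.1), (50, 60, 1.4), (60, 100, -0.2)]
fraction = fraction_amplified(segments)

# Hypothetical per-line amplified fractions for two subtypes.
luminal = [0.05, 0.06, 0.04, 0.07]
basal_b = [0.01, 0.02, 0.01, 0.02]
t_stat, dof = student_t(luminal, basal_b)
```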


