

Primary triple-negative breast cancers (TNBCs), a tumour type defined by lack of oestrogen receptor, progesterone receptor and ERBB2 gene amplification, represent approximately 16% of all breast cancers. Here we show in 104 TNBC cases that at the time of diagnosis these cancers exhibit a wide and continuous spectrum of genomic evolution, with some having only a handful of coding somatic aberrations in a few pathways, whereas others contain hundreds of coding somatic mutations. High-throughput RNA sequencing (RNA-seq) revealed that only approximately 36% of mutations are expressed. Using deep re-sequencing measurements of allelic abundance for 2,414 somatic mutations, we determine for the first time, to our knowledge, in an epithelial tumour subtype, the relative abundance of clonal frequencies among cases representative of the population. We show that TNBCs vary widely in their clonal frequencies at the time of diagnosis, with the basal subtype of TNBC showing more variation than non-basal TNBC. Although p53 (also known as TP53), PIK3CA and PTEN somatic mutations seem to be clonally dominant compared to other genes, in some tumours their clonal frequencies are incompatible with founder status. Mutations in cytoskeletal, cell shape and motility proteins occurred at lower clonal frequencies, suggesting that they occurred later during tumour progression. Taken together, our results show that understanding the biology and therapeutic responses of patients with TNBC will require the determination of individual tumour clonal genotypes.

Conflict of interest statement

Competing interests: The authors declare that they have no competing financial interests.


Figure 1
Distribution of number of validated somatic mutations by case over 65 cases. (a) Mutation frequency (Basal (red), Other (gray)). Patients harbouring known driver gene mutations are indicated. (b) Case-specific and overall (inset) distributions of mutations in CNA classes: HOMD (homozygous deletion), HETD (hemizygous deletion), NEUT (no copy number change), GAIN (single copy gain), AMP (amplification) and HLAMP (high-level amplification). The number of (HOMD, HLAMP) CNAs (black diamonds) and percentage genome altered (green circles) are indicated. (c) Case-specific and overall (inset) distributions of mutations in expression classes: Not (no expression), WT (wildtype expression), Het (mutant and wildtype expression) and Hom (dominant mutant expression).
Figure 2
Population patterns of co-occurrence and mutual exclusion of genomic aberrations in TNBC. (a) Case-specific mutations in known driver genes, plus genes from integrin signaling and ECM-related proteins (laminins, collagens, integrins, myosins and dyneins) derived from all aberration types: high-level amplifications (HLAMP), homozygous deletions (HOMD), missense, truncating, splice site and indel somatic mutations are depicted in genes with at least two aberrations in the population. (b) Distribution of somatic mutations in 25 genes across all exons of 159 additional breast cancers (relative proportion of ER+ cases in green, and ER- in blue), shown as a percentage of cases with one or more mutations.
Figure 3
Network analysis of recurrently mutated genes by somatic point mutations and indels (254 genes). (a) Significantly over-represented pathways (FDR < 0.001) from recurrently mutated genes (see Supplemental methods). Node shading encodes the adjusted p-value (q-value) of the comparison of the distribution of clonal frequencies of mutations in a given pathway to the overall distribution of clonal frequencies. A spectrum of higher (red) and lower (yellow) clonal frequencies is evident. (b) Case-specific mutations shaded according to clonal frequencies in known driver genes, plus genes from integrin signaling and ECM-related proteins (laminins, collagens, integrins, myosins and dyneins).
Figure 4
Clonal evolution in TNBC. (a) Schematic representation of integration of CNA, LOH, allelic abundance measurements and normal cell contamination for clonal frequency estimation (left). Example of a mixture of three clonal genotypes and their resulting clonal frequencies. (b) Estimated clonal frequencies for four cases are shown as the distribution of posterior probabilities from the PyClone model (Supplemental methods). Clonal frequency distributions are coloured by their frequency group membership. (c) (left) Relationship of mutation abundance (synonymous and non-synonymous) and the inferred number of clonal clusters. (middle) Distribution and kernel density (red line) of the number of inferred clonal clusters over 54 TNBCs. (right) Kernel density distribution of clonal clusters for basal (red) and non-basal (grey) tumours.
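
The integration described in Figure 4a (allelic abundance, copy number and normal cell contamination) can be illustrated with a simple point estimate. The sketch below is not the PyClone model used in the paper, which infers posterior distributions; it is a commonly used simplification that assumes every tumour cell carries the same total copy number at the locus and that the number of mutated copies is known. The function names and parameters (`expected_vaf`, `estimate_clonal_frequency`, `purity`, `tumour_cn`, `mutated_copies`) are hypothetical and chosen for illustration only.

```python
def expected_vaf(clonal_frequency, purity, tumour_cn, mutated_copies=1, normal_cn=2):
    """Forward model: expected variant allele fraction in a mixed sample.

    Assumption (not from the paper): all tumour cells share the same total
    copy number, and mutated cells carry `mutated_copies` mutant alleles.
    """
    # Mutant alleles come only from tumour cells that carry the mutation.
    mutant_alleles = purity * clonal_frequency * mutated_copies
    # All alleles at the locus: tumour cells plus contaminating normal cells.
    total_alleles = purity * tumour_cn + (1.0 - purity) * normal_cn
    return mutant_alleles / total_alleles


def estimate_clonal_frequency(vaf, purity, tumour_cn, mutated_copies=1, normal_cn=2):
    """Invert the forward model to get a point estimate of clonal frequency."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if mutated_copies < 1 or tumour_cn < mutated_copies:
        raise ValueError("need 1 <= mutated_copies <= tumour_cn")
    total_alleles = purity * tumour_cn + (1.0 - purity) * normal_cn
    frequency = vaf * total_alleles / (purity * mutated_copies)
    # Sampling noise can push the estimate outside [0, 1]; clamp it.
    return min(max(frequency, 0.0), 1.0)
```

For example, under these assumptions a heterozygous mutation in a copy-neutral region of a sample with 50% tumour content and an observed allele fraction of 0.25 gives `estimate_clonal_frequency(0.25, 0.5, 2) == 1.0`, which is consistent with a mutation present in every tumour cell. An estimate well below 1 would suggest a subclonal mutation.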


Cited by 721 PubMed Central articles
