Identification of high-confidence somatic mutations in whole genome sequence of formalin-fixed breast cancer specimens

Nucleic Acids Res. 2012 Aug;40(14):e107. doi: 10.1093/nar/gks299. Epub 2012 Apr 6.


The utilization of archived, formalin-fixed paraffin-embedded (FFPE) tumor samples for massively parallel sequencing has been challenging due to DNA damage and contamination with normal stroma. Here, we perform whole genome sequencing of DNA isolated from two triple-negative breast cancer tumors archived for >11 years as 5 µm FFPE sections and matched germline DNA. The tumor samples show differing amounts of FFPE-damaged DNA sequencing reads, revealed as relatively high alignment mismatch rates enriched for C · G > T · A substitutions compared to germline samples. This increase in mismatch rate is observable with as few as one million reads, allowing for an upfront evaluation of the sample integrity before whole genome sequencing. By applying innovative quality filters incorporating global nucleotide mismatch rates and local mismatch rates, we present a method to identify high-confidence somatic mutations even in the presence of FFPE-induced DNA damage. This results in a breast cancer mutational profile that is consistent with previous studies and reveals potentially important functional mutations. Our study demonstrates the feasibility of performing genome-wide deep sequencing analysis of FFPE-archived tumors of limited sample size such as residual cancer after treatment or metastatic biopsies.
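The upfront integrity check described in the abstract (an elevated alignment mismatch rate enriched for C · G > T · A substitutions) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `substitution_class` and `mismatch_profile` are hypothetical, the input is assumed to be gapless read/reference string pairs already extracted from alignments, and no thresholds from the paper are implied.

```python
from collections import Counter

# Watson-Crick complements, used to collapse substitutions onto one strand
# (C>T and G>A are the same event seen from opposite strands: C·G>T·A).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return a strand-collapsed label with a pyrimidine reference, e.g. 'C>T'."""
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mismatch_profile(pairs):
    """Summarize mismatches over (read_seq, ref_seq) pairs of equal length.

    Assumes gapless alignments; positions with non-ACGT bases are skipped.
    Returns (global mismatch rate, fraction of mismatches that are C>T, counts).
    """
    aligned = 0
    counts = Counter()
    for read, ref in pairs:
        for read_base, ref_base in zip(read.upper(), ref.upper()):
            if read_base not in COMPLEMENT or ref_base not in COMPLEMENT:
                continue
            aligned += 1
            if read_base != ref_base:
                counts[substitution_class(ref_base, read_base)] += 1
    total = sum(counts.values())
    rate = total / aligned if aligned else 0.0
    ct_fraction = counts["C>T"] / total if total else 0.0
    return rate, ct_fraction, counts


# Toy input: four 4-base reads against the same reference window.
toy_pairs = [
    ("ACGT", "ACGT"),  # perfect match
    ("ATGT", "ACGT"),  # C>T
    ("ACAT", "ACGT"),  # G>A, collapsed to C>T
    ("ACGA", "ACGT"),  # T>A
]
rate, ct_fraction, counts = mismatch_profile(toy_pairs)
```

In practice one would run such a summary on a small subsample of reads from the tumor and from the matched germline sample and compare the two profiles; how large a difference indicates problematic FFPE damage is not something this sketch decides.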

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Artifacts
  • Breast Neoplasms / genetics*
  • DNA Damage
  • DNA Mutational Analysis / methods*
  • DNA Mutational Analysis / standards
  • Female
  • Fixatives*
  • Formaldehyde*
  • Genome, Human
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mutation
  • Paraffin Embedding


Substances

  • Fixatives
  • Formaldehyde