All-FIT: allele-frequency-based imputation of tumor purity from high-depth sequencing data

Bioinformatics. 2020 Apr 1;36(7):2173-2180. doi: 10.1093/bioinformatics/btz865.


Summary: Clinical sequencing aims to identify somatic mutations in cancer cells for accurate diagnosis and treatment. However, most widely used clinical assays lack patient-matched control DNA, and additional analysis is needed to distinguish somatic variants from unfiltered germline variants. Such computational analyses require accurate assessment of tumor cell content in individual specimens. Histological estimates often disagree with results from computational methods, which are primarily designed for matched normal-tumor data and can be confounded by genomic heterogeneity and the presence of sub-clonal mutations. Allele-frequency-based imputation of tumor purity (All-FIT) is an iterative weighted least squares method that estimates specimen tumor purity from the allele frequencies of variants detected in high-depth, targeted, clinical sequencing data. Using simulated and clinical data, we demonstrate All-FIT's accuracy and improved performance against leading computational approaches, highlighting the importance of interpreting purity estimates based on the expected biology of tumors.
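
To illustrate how allele frequencies constrain purity, the following is a minimal sketch and not the published All-FIT implementation. It assumes the standard mixture model in which a variant's expected allele frequency depends on purity p, the total tumor copy number C at the locus, and the number of mutated copies m: p*m / (p*C + 2*(1-p)) for a somatic variant and (p*m + (1-p)) / (p*C + 2*(1-p)) for a heterozygous germline variant. The iterative reweighting described above is replaced here by a plain grid search, and weighting residuals by read depth is an assumption. The names expected_vafs and estimate_purity are hypothetical.

```python
def expected_vafs(purity, ploidy):
    """Expected allele frequencies at a locus with total tumor copy number `ploidy`.

    Assumes a diploid normal compartment and enumerates every mutated-copy
    count m for both somatic and heterozygous germline variants.
    """
    denom = purity * ploidy + 2.0 * (1.0 - purity)
    vafs = []
    for m in range(1, ploidy + 1):
        # Somatic: only tumor cells carry the variant.
        vafs.append(purity * m / denom)
    for m in range(0, ploidy + 1):
        # Germline heterozygous: normal cells contribute one variant copy.
        vafs.append((purity * m + (1.0 - purity)) / denom)
    return vafs


def estimate_purity(variants, steps=100):
    """Grid-search purity minimizing depth-weighted squared residuals.

    `variants` is a list of (observed_vaf, depth, ploidy) tuples.
    Each variant is scored against its nearest expected allele frequency.
    """
    best_purity, best_loss = None, float("inf")
    for i in range(1, steps + 1):
        purity = i / steps
        loss = 0.0
        for vaf, depth, ploidy in variants:
            residual = min(abs(vaf - e) for e in expected_vafs(purity, ploidy))
            loss += depth * residual ** 2  # assumed weight: read depth
        if loss < best_loss:
            best_purity, best_loss = purity, loss
    return best_purity


if __name__ == "__main__":
    # Hypothetical variants at copy-number-2 loci, as (VAF, depth, ploidy).
    example = [(0.30, 500, 2), (0.80, 500, 2)]
    print(estimate_purity(example))
```

A single variant is generally ambiguous under this model: an allele frequency of 0.30 at a two-copy locus is consistent with purities of 0.3, 0.4 and 0.6, depending on whether the variant is somatic or germline and how many copies carry it. Combining several variants narrows the set of compatible purities, which is one reason estimates need to be interpreted against the expected biology of the tumor.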

Availability and implementation: Freely available at

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alleles
  • Computational Biology
  • Gene Frequency
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Neoplasms / genetics*
  • Software