GCfix: a fast and accurate fragment length-specific method for correcting GC bias in cell-free DNA

Bioinformatics. 2025 Jun 2;41(6):btaf293. doi: 10.1093/bioinformatics/btaf293.

Abstract

Motivation: Cell-free DNA (cfDNA) analysis has wide-ranging clinical applications due to its noninvasive nature. However, cfDNA fragmentomics and copy number analysis can be complicated by GC bias. Existing GC correction software is not based on a rigorous analysis of GC bias in cfDNA. Furthermore, there is no standardized metric for comparing GC bias correction methods across large sample sets, nor a rigorous experimental setup to demonstrate their effectiveness on cfDNA data at various coverage levels.

Results: We present GCfix, a method for robust GC bias correction in cfDNA data across diverse coverages. Developed following an in-depth analysis of cfDNA GC bias at the region and fragment length levels, GCfix is both fast and accurate. It works on all reference genomes and generates correction factors, tagged BAM files, and corrected coverage tracks. We also introduce two orthogonal performance metrics for (i) comparing the fragment count density distribution of GC content between expected and corrected samples, and (ii) evaluating coverage profile improvement post-correction. GCfix outperforms existing cfDNA GC bias correction methods on these metrics.
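
To illustrate what a fragment length-specific correction factor can look like, here is a minimal sketch. It is not the GCfix implementation: the function name, the (length, GC count) binning, and the choice of factor (expected share of fragments divided by observed share in each bin) are assumptions made for illustration only. The abstract does not specify how GCfix derives its expected distribution or its factors.

```python
from collections import Counter

def correction_factors(observed, expected):
    """Return a per-bin weight for each (fragment_length, gc_count) bin.

    observed, expected: iterables of (fragment_length, gc_count) tuples.
    Hypothetical helper: the weight is the expected share of fragments in a
    bin divided by the observed share, so over-represented bins get a weight
    below 1 and under-represented bins get a weight above 1.
    """
    obs = Counter(observed)
    exp = Counter(expected)
    n_obs = sum(obs.values())
    n_exp = sum(exp.values())
    factors = {}
    for key, obs_count in obs.items():
        exp_count = exp.get(key, 0)
        if exp_count == 0:
            continue  # no expectation for this bin, so leave it uncorrected
        factors[key] = (exp_count / n_exp) / (obs_count / n_obs)
    return factors

# Toy data (invented): 150 bp fragments with 60 or 90 G/C bases.
observed = [(150, 60)] * 3 + [(150, 90)] * 1
expected = [(150, 60)] * 2 + [(150, 90)] * 2
factors = correction_factors(observed, expected)
```

In this toy input the low-GC bin is over-represented relative to expectation, so its weight falls below 1, while the high-GC bin is up-weighted. Keeping fragment length in the bin key is what makes the factors length-specific: fragments with the same GC content but different lengths can receive different weights.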

Availability and implementation: GCfix software and code for reproducing the figures are publicly accessible on GitHub: https://github.com/Rafeed-bot/GCfix_Software.

MeSH terms

  • Algorithms
  • Base Composition
  • Cell-Free Nucleic Acids* / chemistry
  • Cell-Free Nucleic Acids* / genetics
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Sequence Analysis, DNA* / methods
  • Software*

Substances

  • Cell-Free Nucleic Acids