BCFtools/csq: haplotype-aware variant consequences

Bioinformatics. 2017 Jul 1;33(13):2037-2039. doi: 10.1093/bioinformatics/btx100.

Motivation: Prediction of functional variant consequences is an important part of sequencing pipelines, allowing the categorization and prioritization of genetic variants for follow-up analysis. However, current predictors analyze variants as isolated events, which can lead to incorrect predictions when adjacent variants alter the same codon, or when a frame-shifting indel is followed by a frame-restoring indel. Exploiting known haplotype information when making consequence predictions can resolve these issues.
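
The same-codon problem described above can be illustrated with a toy example. The following is a minimal Python sketch and not the BCFtools/csq implementation: the codon table is deliberately partial, and the helper names (apply_snvs, consequence) and the chosen codon are hypothetical, introduced only for illustration. It compares the consequence predicted for two single-nucleotide variants analyzed in isolation with the consequence predicted when both are known to lie on the same haplotype.

```python
# Toy illustration only: a partial codon table covering the codons used below.
CODON_TABLE = {
    "CAG": "Q",  # glutamine (reference codon in this example)
    "TAG": "*",  # stop codon
    "CTG": "L",  # leucine
    "TTG": "L",  # leucine
}


def apply_snvs(codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Classify a codon change using the partial table above."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    return "missense"


REF = "CAG"
snv_a = (0, "T")  # C>T at the first codon position:  CAG -> TAG
snv_b = (1, "T")  # A>T at the second codon position: CAG -> CTG

# Each variant analyzed as an isolated event.
isolated = [consequence(REF, apply_snvs(REF, [v])) for v in (snv_a, snv_b)]

# Both variants applied together, as on a single phased haplotype: CAG -> TTG.
combined = consequence(REF, apply_snvs(REF, [snv_a, snv_b]))

print("isolated:", isolated)
print("combined:", combined)
```

In this toy case the isolated analysis reports a stop gain for the first variant, whereas applying both variants to the same haplotype yields a leucine codon, so the combined change is a missense variant rather than a premature stop.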

Results: BCFtools/csq is a fast program for haplotype-aware consequence calling which can take into account known phase. Consequence predictions are changed for 501 of 5019 compound variants found in the 81.7M variants in the 1000 Genomes Project data, with an average of 139 compound variants per haplotype. Predictions match existing tools when run in localized mode, but the program is an order of magnitude faster and requires an order of magnitude less memory.

Availability and implementation: The program is freely available for commercial and non-commercial use as part of the BCFtools package, which can be downloaded from http://samtools.github.io/bcftools.

Contact: pd3@sanger.ac.uk.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Genetic Variation*
  • Genome, Human*
  • Genomics / methods
  • Haplotypes*
  • Humans
  • INDEL Mutation
  • Sequence Analysis, DNA / methods*
  • Software*