CNVpytor: a tool for copy number variation detection and analysis from read depth and allele imbalance in whole-genome sequencing

Gigascience. 2021 Nov 18;10(11):giab074. doi: 10.1093/gigascience/giab074.

Abstract

Background: Detecting copy number variations (CNVs) and copy number alterations (CNAs) based on whole-genome sequencing data is important for personalized genomics and treatment. CNVnator is one of the most popular tools for CNV/CNA discovery and analysis based on read depth.

Findings: Herein, we present CNVpytor, an extension of CNVnator developed in Python. CNVpytor inherits the reimplemented core engine of its predecessor and extends visualization, modularization, performance, and functionality. Additionally, CNVpytor uses B-allele frequency likelihood information from single-nucleotide polymorphisms and small indels as additional evidence for CNVs/CNAs and as primary information for copy number-neutral losses of heterozygosity.
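
To illustrate how a B-allele frequency (BAF) likelihood could be derived from variant read counts, the following is a minimal sketch, not CNVpytor's actual implementation. It assumes that each heterozygous variant in a region contributes a binomial likelihood, symmetrized because the haplotype carrying the alternate allele is unknown, and that variants are independent. The function names `baf_likelihood` and `estimate_baf`, the grid resolution, and the example read counts are all hypothetical.

```python
import math


def baf_likelihood(counts, grid_size=100):
    """Return (grid, likelihood) for the BAF of one genomic region.

    counts: list of (ref_reads, alt_reads) pairs for heterozygous variants.
    Assumption (illustrative only): per-variant likelihood is
    p^ref * (1-p)^alt + p^alt * (1-p)^ref, multiplied across variants.
    """
    # Interior grid points avoid log(0) at p = 0 and p = 1.
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    log_lik = [0.0] * grid_size
    for ref, alt in counts:
        for i, p in enumerate(grid):
            a = ref * math.log(p) + alt * math.log(1.0 - p)
            b = alt * math.log(p) + ref * math.log(1.0 - p)
            # Log-sum-exp keeps the sum of the two terms numerically stable.
            m = max(a, b)
            log_lik[i] += m + math.log(math.exp(a - m) + math.exp(b - m))
    peak = max(log_lik)
    lik = [math.exp(v - peak) for v in log_lik]
    total = sum(lik)
    return grid, [v / total for v in lik]


def estimate_baf(counts, grid_size=100):
    """Return the most likely BAF on the folded interval (0, 0.5]."""
    grid, lik = baf_likelihood(counts, grid_size)
    folded = [(l, p) for p, l in zip(grid, lik) if p <= 0.5]
    return max(folded, key=lambda t: t[0])[1]


# Hypothetical read counts: a balanced region and a 2:1 imbalanced region.
balanced = [(15, 15), (14, 16), (16, 14)]
imbalanced = [(20, 10), (10, 20), (19, 11)]
baf_balanced = estimate_baf(balanced)
baf_imbalanced = estimate_baf(imbalanced)
```

Under these assumptions, a region with balanced allele counts yields an estimate close to 0.5, whereas a region with roughly 2:1 allele counts yields an estimate close to 1/3, the pattern expected for a single-copy gain. A region with normal read depth but a BAF far from 0.5 would be a candidate for copy number-neutral loss of heterozygosity.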

Conclusions: CNVpytor is significantly faster than CNVnator, particularly for parsing alignment files (2-20 times faster), and its intermediate files are 20-50 times smaller. CNV calls can be filtered using several criteria, annotated, and merged over multiple samples. Its modular architecture allows it to be used in shared and cloud environments such as Google Colab and Jupyter notebooks. Data can be exported into JBrowse, while a lightweight plugin version of CNVpytor for JBrowse enables nearly instant and GUI-assisted analysis of CNVs by any user. The CNVpytor release and source code are available on GitHub at https://github.com/abyzovlab/CNVpytor under the MIT license.

Keywords: Python; copy number alterations; copy number variations; whole-genome sequencing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • DNA Copy Number Variations*
  • Genomics
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, DNA
  • Software*
  • Whole Genome Sequencing