SeqCNV: a novel method for identification of copy number variations in targeted next-generation sequencing data

BMC Bioinformatics. 2017 Mar 3;18(1):147. doi: 10.1186/s12859-017-1566-3.

Abstract

Background: Targeted next-generation sequencing (NGS) has been widely used as a cost-effective way to identify the genetic basis of human disorders. Copy number variations (CNVs) contribute significantly to human genomic variability, and some of them can lead to disease. However, effective detection of CNVs from targeted capture sequencing data remains challenging.

Results: Here we present SeqCNV, a novel CNV calling method designed for capture NGS data. SeqCNV extracts read depth information and uses a maximum penalized likelihood estimation (MPLE) model to identify the copy number ratio and CNV boundaries. We applied SeqCNV to both bacterial artificial chromosome (BAC) and human patient NGS data to identify CNVs. These CNVs were validated by array comparative genomic hybridization (aCGH).
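
The general idea of calling CNVs from read depth with a penalized likelihood can be illustrated with a minimal sketch. This is not the SeqCNV implementation: the abstract does not give the model details, so the sketch assumes a Gaussian model on per-target log2 depth ratios, a fixed penalty per segment, and exact dynamic programming over breakpoints. The function names (`log2_ratios`, `segment`), the pseudocount, and the penalty value are all hypothetical choices made for illustration.

```python
import math

def log2_ratios(case_depth, control_depth, pseudo=0.5):
    """Per-target log2 ratio of library-size-normalized read depths.

    `pseudo` is an assumed pseudocount that avoids log(0) on empty targets.
    """
    case_total = sum(case_depth)
    control_total = sum(control_depth)
    return [
        math.log2(((c + pseudo) / case_total) / ((r + pseudo) / control_total))
        for c, r in zip(case_depth, control_depth)
    ]

def segment(values, penalty):
    """Split `values` into segments of constant mean.

    Minimizes (within-segment squared error + penalty per segment), which is
    a penalized Gaussian likelihood up to constants. Returns a list of
    (start, end, mean) with half-open index ranges.
    """
    n = len(values)
    # Prefix sums let us compute any segment's squared error in O(1).
    s1, s2 = [0.0], [0.0]
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [math.inf] * n   # best[j]: optimal score for values[:j]
    prev = [0] * (n + 1)            # prev[j]: start of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j], prev[j] = candidate, i

    # Walk back through the stored breakpoints to recover the segments.
    segments = []
    j = n
    while j > 0:
        i = prev[j]
        segments.append((i, j, (s1[j] - s1[i]) / (j - i)))
        j = i
    segments.reverse()
    return segments

# Toy data: 12 capture targets, with targets 4-7 at half depth in the case sample.
control = [100] * 12
case = [100] * 4 + [50] * 4 + [100] * 4
ratios = log2_ratios(case, control)
segments = segment(ratios, penalty=0.5)
for start, end, mean in segments:
    print(start, end, round(2 ** mean, 2))  # 2 ** mean estimates the copy number ratio
```

In this sketch the penalty controls the trade-off between fit and the number of breakpoints: a larger value merges small or weak deviations into their neighbors, while a smaller value produces more, shorter segments. Note that simple library-size normalization shifts all ratios slightly when a large fraction of targets is deleted or duplicated, so the segment means are relative rather than absolute.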

Conclusions: SeqCNV is able to robustly identify CNVs of different sizes using capture NGS data. Compared with other CNV-calling methods, SeqCNV shows a significant improvement in both sensitivity and specificity.

Keywords: Copy number variation; Maximum penalized likelihood estimation; Next-generation sequencing.

MeSH terms

  • DNA Copy Number Variations*
  • Genome, Human*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Likelihood Functions
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods
  • Software*