SNPflow: a lightweight application for the processing, storing and automatic quality checking of genotyping assays

PLoS One. 2013;8(3):e59508. doi: 10.1371/journal.pone.0059508. Epub 2013 Mar 19.

Abstract

Single nucleotide polymorphisms (SNPs) play a prominent role in modern genetics. Current genotyping technologies such as Sequenom iPLEX, ABI TaqMan and KBioscience KASPar have made the genotyping of huge SNP sets in large populations straightforward and allow the generation of hundreds of thousands of genotypes even in medium-sized labs. While data generation is straightforward, the subsequent data conversion, storage and quality control steps are time-consuming, error-prone and require extensive bioinformatic support. To ease this tedious process, we developed SNPflow, a lightweight, intuitive and easily deployable application that processes genotype data from Sequenom MassARRAY (iPLEX) and ABI 7900HT (TaqMan, KASPar) systems and can be extended to other genotyping methods. SNPflow automatically converts the raw output files to ready-to-use genotype lists and calculates all standard quality control values, such as call rate, expected and observed number of replicates, minor allele frequency, absolute number of discordant replicates, discordance rate and the p-value of the Hardy-Weinberg equilibrium (HWE) test. It also checks the plausibility of the observed genotype frequencies by comparing them to HapMap/1000 Genomes data, provides a module for processing SNPs that allow sex determination for DNA quality control purposes and, finally, stores all data in a relational database. SNPflow runs on all common operating systems and comes as both a stand-alone version and a multi-user version for laboratory-wide use. The software, a user manual, screenshots and a screencast illustrating the main features are available at http://genepi-snpflow.i-med.ac.at.
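
To make the quality control values named in the abstract concrete, the following is a minimal Python sketch of how call rate, minor allele frequency, discordance rate and an HWE p-value could be computed for a single biallelic SNP. It is not SNPflow's code: the function names, the genotype encoding (two-letter strings such as "AG", with None for a failed call) and the use of a 1-degree-of-freedom chi-square test for HWE are all assumptions made for illustration, since the abstract does not state which HWE test the software applies.

```python
import math
from collections import Counter


def call_rate(genotypes):
    """Fraction of samples with a genotype call (None marks a failed call)."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among all called alleles of a biallelic SNP."""
    alleles = Counter(a for g in genotypes if g is not None for a in g)
    if len(alleles) < 2:
        return 0.0  # monomorphic (or no calls): no minor allele observed
    return min(alleles.values()) / sum(alleles.values())


def discordance_rate(replicate_pairs):
    """Share of replicate pairs with differing calls, among pairs where both were called."""
    compared = [(a, b) for a, b in replicate_pairs if a is not None and b is not None]
    if not compared:
        return 0.0
    discordant = sum(1 for a, b in compared if a != b)
    return discordant / len(compared)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a chi-square HWE test (1 degree of freedom) from genotype counts.

    Assumption: a plain chi-square test without continuity correction;
    an exact test is often preferred when genotype counts are small.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))
```

In this sketch a SNP would be described by a list of per-sample calls, for example `["AA", "AA", "AG", "GG", None]`, and replicates by a list of call pairs. Any thresholds for flagging a SNP (for instance a minimum call rate) are left out because the abstract does not specify them.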

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic*
  • Genotype
  • Polymorphism, Single Nucleotide / genetics*
  • Software*

Grants and funding

Hansi Weissensteiner was supported by a scholarship from the Autonomous Province of Bozen/Bolzano (South Tyrol) and Sebastian Schönherr by a scholarship from the University of Innsbruck (Doctoral grant for young researchers, MIP10/2009/3). The project was supported by grants from the “Standortagentur Tirol” and the “Genomics of Lipid-Associated Disorders-GOLD” of the Austrian Genome Research Programme GEN-AU to Florian Kronenberg. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.