The International HapMap Project provides a resource of genotypic data on single nucleotide polymorphisms (SNPs), which can be used in various association studies to identify the genetic determinants for phenotypic variations. Prior to the association studies, the HapMap dataset should be preprocessed in order to reduce the computation time and control the multiple testing problem. The less informative SNPs including those with very low genotyping rate and SNPs with rare minor allele frequencies to some extent in one or more population are removed. Some research designs only use SNPs in a subset of HapMap cell lines. Although the HapMap website and other association software packages have provided some basic tools for optimizing these datasets, a fast and user-friendly program to generate the output for filtered genotypic data would be beneficial for association studies. Here, we present a flexible, straight-forward bioinformatics program that can be useful in preparing the HapMap genotypic data for association studies by specifying cell lines and two common filtering criteria: minor allele frequencies and genotyping rate. The software was developed for Microsoft Windows and written in C++.
Availability: The Windows executable and source code in Microsoft Visual C++ are available at Google Code (http://hapmap-filter-v1.googlecode.com/) or upon request. Their distribution is subject to GNU General Public License v3.
Keywords: HapMap; genotype; genotyping rate; minor allele frequency; single nucleotide polymorphism.