A new genotype imputation method with tolerance to high missing rate and rare variants

PLoS One. 2014 Jun 27;9(6):e101025. doi: 10.1371/journal.pone.0101025. eCollection 2014.

Abstract

We report a novel algorithm, iBLUP, that imputes missing genotypes by simultaneously and comprehensively using identity by descent and linkage disequilibrium information. Simulation studies showed that the algorithm was drastically more tolerant of high missing rates, especially for rare variants, than other common imputation methods such as BEAGLE and fastPHASE. At a missing rate of 70%, the accuracies of BEAGLE and fastPHASE dropped to 0.82 and 0.74, respectively, while iBLUP retained an accuracy of 0.95. For the minor allele, the accuracies of BEAGLE and fastPHASE decreased to -0.1 and 0.03, respectively, while iBLUP still had an accuracy of 0.61. We implemented the algorithm in a publicly available software package, also named iBLUP. We also demonstrated the application of iBLUP to real sequencing data from an outbred pig population.
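
The evaluation described above (hide a fraction of known genotypes, impute them, then score the imputed values against the truth) can be sketched roughly as follows. This is only an illustration, not the iBLUP implementation: the abstract does not define its accuracy measure, so the use of a Pearson correlation here is an assumption (chosen because a reported value of -0.1 is consistent with a correlation-type measure), as are the 0/1/2 genotype coding and the hypothetical helper names `mask_genotypes` and `accuracy_on_masked`.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (NaN if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def mask_genotypes(genotypes, missing_rate, rng):
    """Hide a random fraction of genotype calls (hypothetical helper).

    genotypes: list of rows (individuals), each a list of 0/1/2 allele counts.
    Returns a copy with masked cells set to None, plus the masked (row, col) positions.
    """
    cells = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    n_mask = round(missing_rate * len(cells))
    masked = rng.sample(cells, n_mask)
    out = [list(row) for row in genotypes]  # copy so the truth stays intact
    for i, j in masked:
        out[i][j] = None
    return out, masked


def accuracy_on_masked(truth, imputed, masked):
    """Assumed accuracy measure: correlation of true vs. imputed values at masked cells only."""
    t = [truth[i][j] for i, j in masked]
    p = [imputed[i][j] for i, j in masked]
    return pearson(t, p)
```

Under these assumptions, a comparison like the one reported would call `mask_genotypes(truth, 0.7, random.Random(seed))`, pass the masked matrix to each imputation tool, and score each tool's output with `accuracy_on_masked`. Restricting the masked positions to rare-variant markers would give a rough analogue of the minor-allele figures.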

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Haplotypes*
  • Polymorphism, Genetic
  • Sensitivity and Specificity
  • Software*
  • Swine

Grant support

This study was supported by the 2011-2012 animal germplasm resources conservation project from the Ministry of Agriculture of China, the National Natural Science Foundation of China (grant nos. 31370043, 31272414, 31101706), and the National 948 Project of China (2012-Z26, 2011-G2A). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.