Haplotype estimation for biobank-scale data sets

Nat Genet. 2016 Jul;48(7):817-20. doi: 10.1038/ng.3583. Epub 2016 Jun 6.


The UK Biobank (UKB) has recently released genotypes on 152,328 individuals together with extensive phenotypic and lifestyle information. We present a new phasing method, SHAPEIT3, that can handle such biobank-scale data sets and results in switch error rates as low as ∼0.3%. The method exhibits O(N log N) scaling with sample size N, enabling fast and accurate phasing of even larger cohorts.
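
To illustrate the switch error rate quoted above, here is a minimal Python sketch of how that metric is commonly computed for one individual. It is not taken from SHAPEIT3. The function name `switch_error_rate` and the toy haplotypes are hypothetical, and the sketch assumes biallelic sites coded 0/1, no missing data, and estimated genotypes that agree with the true genotypes at every site.

```python
def switch_error_rate(truth, estimate):
    """Fraction of consecutive heterozygous site pairs whose relative
    phase differs between the true and the estimated haplotypes.

    truth, estimate: (hap1, hap2) tuples of equal-length 0/1 allele lists.
    Assumes both describe the same genotypes (illustrative simplification).
    """
    t1, t2 = truth
    e1, _ = estimate

    # Only heterozygous sites carry phase information.
    het = [i for i in range(len(t1)) if t1[i] != t2[i]]
    if len(het) < 2:
        return 0.0  # no pair of het sites, so no switch is possible

    # True where the first estimated haplotype matches the first true one.
    orientation = [e1[i] == t1[i] for i in het]

    # A switch error is a change of orientation between adjacent het sites.
    switches = sum(a != b for a, b in zip(orientation, orientation[1:]))
    return switches / (len(het) - 1)


# Toy example: four heterozygous sites and one homozygous site.
truth = ([0, 1, 0, 1, 1], [1, 0, 1, 0, 1])
estimate = ([0, 1, 1, 0, 1], [1, 0, 0, 1, 1])  # phase flips after the second site
ser = switch_error_rate(truth, estimate)
```

In this toy case one of the three adjacent heterozygous pairs is switched, so `ser` is one third. Swapping the two estimated haplotypes wholesale gives zero, because the labelling of the two haplotypes carries no information.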

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Biological Specimen Banks*
  • Cohort Studies
  • Computational Biology / methods*
  • Datasets as Topic
  • European Continental Ancestry Group
  • Genetics, Population*
  • Genome, Human
  • Genomics
  • Haplotypes / genetics*
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Polymorphism, Single Nucleotide / genetics
  • Sequence Analysis, DNA / methods
  • United Kingdom