A hidden Markov model for investigating recent positive selection through haplotype structure

Theor Popul Biol. 2015 Feb;99:18-30. doi: 10.1016/j.tpb.2014.11.001. Epub 2014 Nov 13.


Recent positive selection can increase the frequency of an advantageous mutant rapidly enough that a relatively long ancestral haplotype remains intact around it. We present a hidden Markov model (HMM) to identify such haplotype structures. Given the HMM-identified haplotype structures, a population genetic model for the extent of ancestral haplotypes is then adopted to infer the selection intensity and the allele age. Simulations show that this method can detect selection under a wide range of conditions and has higher power than the existing frequency spectrum-based method. In addition, it provides good estimates of the selection coefficients and allele ages for strong selection. The method analyzes large data sets in a reasonable running time. The method is applied to HapMap III data in a genome scan and identifies a list of candidate regions putatively under recent positive selection. It is also applied to several genes known to be under recent positive selection, including the LCT, KITLG and TYRP1 genes in Northern Europeans, and OCA2 in East Asians, to estimate their allele ages and selection coefficients.
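
As a rough illustration of how an HMM can delimit an intact ancestral haplotype, the following is a minimal sketch, not the model described in the paper. It assumes a hypothetical two-state chain (ANCESTRAL, BACKGROUND) read outward from the selected site, with binary observations (1 = the allele matches the ancestral haplotype, 0 = it does not). All state names, probabilities and the example data are illustrative assumptions.

```python
import math

def log(p):
    # Safe logarithm: probability 0 maps to -inf instead of raising.
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, log_init, log_trans, log_emit):
    """Most probable hidden-state path of a discrete HMM, computed in log space."""
    n_states = len(log_init)
    # score[s] = best log-probability of any path ending in state s
    score = [log_init[s] + log_emit[s][obs[0]] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for o in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[s][o])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

ANCESTRAL, BACKGROUND = 0, 1  # hypothetical state labels

# Assumed parameters, chosen only for illustration.
log_init = [log(0.99), log(0.01)]
log_trans = [
    [log(0.9), log(0.1)],  # ANCESTRAL may break down (e.g. by recombination)
    [log(0.0), log(1.0)],  # BACKGROUND is absorbing in this sketch
]
log_emit = [
    [log(0.01), log(0.99)],  # ANCESTRAL: mismatches are rare
    [log(0.5), log(0.5)],    # BACKGROUND: match or mismatch equally likely
]

# Made-up observations at successive markers moving away from the selected site.
obs = [1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0]

path = viterbi(obs, log_init, log_trans, log_emit)
ancestral_extent = path.count(ANCESTRAL)  # number of markers still ancestral
print(path, ancestral_extent)
```

The absorbing BACKGROUND state is a design choice of this sketch: it forces the decoded path to be a single ancestral tract followed by background, so the tract length can be read off directly. The paper's own state space, emissions and inference procedure may differ.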

Keywords: Allele age; Haplotype structure; Hidden Markov model; Recent positive selection; Selection intensity.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alleles
  • Asian Continental Ancestry Group
  • Chromosomes
  • Computer Simulation
  • European Continental Ancestry Group
  • Genetics, Population
  • Haplotypes / genetics*
  • Humans
  • Lactase / genetics
  • Markov Chains*
  • Models, Genetic
  • Mutation
  • Selection, Genetic / genetics*
  • Skin Pigmentation / genetics

Substances

  • Lactase