The Effects of Population Size Histories on Estimates of Selection Coefficients from Time-Series Genetic Data

Mol Biol Evol. 2016 Nov;33(11):3002-3027. doi: 10.1093/molbev/msw173. Epub 2016 Aug 22.


Many approaches have been developed for inferring selection coefficients from time-series data while accounting for genetic drift. These approaches have been motivated by the intuition that properly accounting for the population size history can significantly improve estimates of selective strengths. However, the improvement in inference accuracy that can be attained by modeling drift has not been characterized. Here, by comparing maximum likelihood estimates of selection coefficients that account for the true population size history with estimates that ignore drift by assuming allele frequencies evolve deterministically in a population of infinite size, we address the following questions: how much can modeling the population size history improve estimates of selection coefficients? How much can mis-inferred population sizes hurt inferences of selection coefficients? We conduct our analysis under the discrete Wright–Fisher model by deriving the exact probability of an allele frequency trajectory in a population of time-varying size and we replicate our results under the diffusion model. For both models, we find that ignoring drift leads to estimates of selection coefficients that are nearly as accurate as estimates that account for the true population history, even when population sizes are small and drift is strong. This result is of interest because inference methods that ignore drift are widely used in evolutionary studies and can be many orders of magnitude faster than methods that account for population sizes.
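
The contrast described above, between a trajectory shaped by drift in a finite population and an estimator that treats the trajectory as deterministic, can be illustrated with a minimal sketch. It is not the authors' likelihood method. It assumes a haploid selection model in which the favored allele has relative fitness 1 + s, simulates Wright–Fisher sampling under a user-supplied sequence of population sizes, and recovers s from the change in log-odds between the first and last time points. All function names, parameter values, and the example population size history are hypothetical.

```python
import math
import random


def deterministic_step(x, s):
    """One generation of haploid selection with no drift (infinite population).

    Assumes the favored allele has relative fitness 1 + s.
    """
    return x * (1.0 + s) / (1.0 + s * x)


def simulate_wright_fisher(x0, s, pop_sizes, rng):
    """Simulate an allele frequency trajectory with selection and drift.

    pop_sizes[t] is the number of individuals sampled to form generation t + 1,
    so the population size may vary over time.
    """
    freqs = [x0]
    x = x0
    for n in pop_sizes:
        p = deterministic_step(x, s)  # expected frequency after selection
        # Binomial sampling of n offspring, written as n Bernoulli draws.
        count = sum(1 for _ in range(n) if rng.random() < p)
        x = count / n
        freqs.append(x)
    return freqs


def logit(x, eps=1e-6):
    """Log-odds of x, clipped away from 0 and 1 so fixation or loss stays finite."""
    x = min(max(x, eps), 1.0 - eps)
    return math.log(x / (1.0 - x))


def estimate_s_ignoring_drift(freqs):
    """Estimate s as if the trajectory were deterministic.

    Under the haploid model above the odds x / (1 - x) are multiplied by
    (1 + s) each generation, so the log-odds change linearly in time.
    This endpoint estimator is a simple stand-in, not a maximum likelihood fit.
    """
    generations = len(freqs) - 1
    slope = (logit(freqs[-1]) - logit(freqs[0])) / generations
    return math.exp(slope) - 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical history: a small population followed by an expansion.
    sizes = [100] * 25 + [1000] * 25
    trajectory = simulate_wright_fisher(0.2, 0.05, sizes, rng)
    print(estimate_s_ignoring_drift(trajectory))
```

Repeating the simulation over many seeds and comparing the spread of the estimates with the true s gives a rough sense of how much drift perturbs an estimator that ignores it. A drift-aware comparison would need the full Wright–Fisher transition probabilities, which this sketch leaves out.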

Keywords: Wright–Fisher; diffusion; inference; selection; time series.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Evolution
  • Computer Simulation
  • Gene Frequency
  • Genetic Drift
  • Genetics, Population / methods*
  • Likelihood Functions
  • Models, Genetic*
  • Population Density
  • Selection, Genetic*