Under the coalescent model, the random number n_t of lineages ancestral to a sample is nearly deterministic as a function of time when n_t is moderate to large in value, and it is well approximated by its expectation E[n_t]. In turn, this expectation is well approximated by simple deterministic functions that are easy to compute. Such deterministic functions have been applied to estimate allele age, effective population size, and genetic diversity, and they have been used to study properties of models of infectious disease dynamics. Although a number of simple approximations of E[n_t] have been derived and applied to problems of population-genetic inference, the theoretical accuracy of the resulting approximate formulas and the inferences obtained using these approximations is not known, and the range of problems to which they can be applied is not well understood. Here, we demonstrate general procedures by which the approximation n_t ≈ E[n_t] can be used to reduce the computational complexity of coalescent formulas, and we show that the resulting approximations converge to their true values under simple assumptions. Such approximations provide alternatives to exact formulas that are computationally intractable or numerically unstable when the number of sampled lineages is moderate or large. We also extend an existing class of approximations of E[n_t] to the case of multiple populations of time-varying size with migration among them. Our results facilitate the use of the deterministic approximation n_t ≈ E[n_t] for deriving functionally simple, computationally efficient, and numerically stable approximations of coalescent formulas under complicated demographic scenarios.
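As a rough illustration of the kind of deterministic approximation described above, the sketch below compares the exact expected number of ancestral lineages at time t with a simple deterministic curve. It rests on assumptions that the abstract does not state: a single population of constant size, time measured in coalescent units (each pair of lineages coalesces at rate 1), and a deterministic curve obtained by solving dn/dt = -n(n-1)/2 from n(0) = n. The exact expectation uses a standard closed-form sum for the constant-size coalescent. The function names are hypothetical and are not taken from the paper.

```python
import math


def expected_lineages_exact(n, t):
    """Exact expected number of ancestral lineages at time t for a sample of n.

    Assumes a constant-size population and time in coalescent units.
    Computes sum_{i=1..n} (2i-1) * exp(-i(i-1)t/2) * n_[i] / n_(i), where
    n_[i] is the falling factorial and n_(i) the rising factorial of n.
    """
    total = 0.0
    ratio = 1.0  # running value of n_[i] / n_(i)
    for i in range(1, n + 1):
        ratio *= (n - i + 1) / (n + i - 1)
        total += (2 * i - 1) * ratio * math.exp(-i * (i - 1) * t / 2)
    return total


def lineages_deterministic(n, t):
    """Closed-form solution of dn/dt = -n(n-1)/2 with n(0) = n (an assumed model)."""
    return n / (n - (n - 1) * math.exp(-t / 2))


# Compare the two curves for a sample of 100 lineages at a few time points.
for t in (0.0, 0.01, 0.05, 0.1, 0.5, 2.0):
    exact = expected_lineages_exact(100, t)
    approx = lineages_deterministic(100, t)
    print(f"t={t:<5} exact={exact:8.3f} deterministic={approx:8.3f}")
```

In this sketch the deterministic curve is a single expression, while the exact sum has n terms. The two should be close while many lineages remain and drift apart in relative terms once only a few lineages are left, in line with the statement that the approximation works best when the number of lineages is moderate to large.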
Keywords: Approximation; Coalescent; Computational complexity.
Copyright © 2014 Elsevier Inc. All rights reserved.