The ratio of singletons to the total number of segregating sites is used to estimate a reproduction parameter in a population model of large offspring numbers without having to jointly estimate the mutation rate. For neutral genetic variation, this ratio is equivalent to the ratio of the total length of external branches to the total length of the gene genealogy. A multinomial maximum likelihood method that takes into account more frequency classes than just the singletons is developed to estimate the parameter of another large offspring number model. The performance of these methods with regard to sample size, mutation rate, and bias is investigated by simulation. Simulation shows that the expected value of the ratio of the total length of external branches to the total length of the whole tree decreases for the Kingman coalescent as sample size increases, but can increase or decrease, depending on parameter values, for Λ coalescents. Considering ratios of tree statistics, as opposed to considering lengths of various subtrees separately, can yield better insight into the dynamics of gene genealogies.
Copyright © 2011 Elsevier Inc. All rights reserved.
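
The Kingman-coalescent claim in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' code: the function names `kingman_branch_lengths` and `mean_ratio`, the replicate counts, and the seed are all assumptions made for illustration. It assumes the standard Kingman coalescent, in which k lineages wait an exponential time with rate k(k-1)/2 before a uniformly chosen pair merges, and it estimates the expected ratio of external to total branch length by averaging over replicate genealogies. Λ coalescents, which allow multiple mergers, are not covered here.

```python
import random


def kingman_branch_lengths(n, rng):
    """Simulate one Kingman genealogy of n samples.

    Returns (external, total): the summed length of external branches
    and the summed length of all branches, in coalescent time units.
    """
    # Track each active lineage by the number of sampled leaves below it;
    # a lineage with exactly one leaf is an external branch.
    sizes = [1] * n
    external = 0.0
    total = 0.0
    k = n
    while k > 1:
        # Waiting time while k lineages are present: Exp(k choose 2).
        t = rng.expovariate(k * (k - 1) / 2.0)
        total += k * t
        external += t * sum(1 for s in sizes if s == 1)
        # Merge a uniformly chosen pair of lineages.
        i, j = rng.sample(range(k), 2)
        merged = sizes[i] + sizes[j]
        for idx in sorted((i, j), reverse=True):
            sizes.pop(idx)
        sizes.append(merged)
        k -= 1
    return external, total


def mean_ratio(n, reps, seed=1):
    """Monte Carlo estimate of E[external length / total length]."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(reps):
        ext, tot = kingman_branch_lengths(n, rng)
        acc += ext / tot
    return acc / reps


if __name__ == "__main__":
    for n in (2, 5, 20, 50):
        print(n, round(mean_ratio(n, 2000), 3))
```

For a sample of size 2 both branches are external, so the ratio is exactly 1; for larger samples the estimate should fall as n grows, in line with the behaviour described for the Kingman coalescent. Because the mutation rate never enters the computation, the sketch also shows why a ratio of this kind can be studied without estimating that rate.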