Sample size for a phylogenetic inference

Mol Biol Evol. 1992 Jul;9(4):753-69. doi: 10.1093/oxfordjournals.molbev.a040757.


The objective of this work is to describe sample-size calculations for the inference of a nonzero central branch length in an unrooted four-species phylogeny. Attention is restricted to independent binary characters, such as might be obtained from an alignment of the purine-pyrimidine sequences of a nucleic acid molecule. A statistical test based on a multinomial model for character-state configurations is described. The importance of including invariable sites in models for sequence change is demonstrated, and their effect on sample size is quantified. The methods are applied to a four-species alignment of small-subunit rRNA sequences derived from two archaebacteria, a eubacterium, and a eukaryote. We conclude that the information in these sequences is not sufficient to resolve the branching order of this tree. Estimates of the number of aligned nucleotide positions required to provide a reasonably powerful test are given.
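
As a rough illustration of the data such a multinomial model works on, the sketch below recodes four aligned sequences as purine (R) or pyrimidine (Y) and tallies the character-state configurations at each site. With binary characters and four species there are eight configurations once complementary patterns are pooled: one invariant, four in which a single species differs, and three two-against-two splits. The function names, the category labels, and the choice to skip sites containing gaps or ambiguous bases are assumptions made for this example; they are not taken from the paper, and the paper's test statistic and sample-size formulas are not reproduced here.

```python
from collections import Counter

# Purine/pyrimidine recoding: A and G are purines (R); C, T and U are pyrimidines (Y).
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}

# Configuration labels keyed by whether species 2, 3 and 4 share the state of species 1.
# These labels are hypothetical names chosen for this sketch.
CONFIGURATIONS = {
    (True, True, True): "invariant",
    (False, False, False): "single_1",   # species 1 differs from the other three
    (False, True, True): "single_2",
    (True, False, True): "single_3",
    (True, True, False): "single_4",
    (True, False, False): "split_12|34",
    (False, True, False): "split_13|24",
    (False, False, True): "split_14|23",
}


def classify_site(column):
    """Return the configuration label for one alignment column of four bases,
    or None if any base is a gap or is otherwise not A, C, G, T or U."""
    states = [RY.get(base.upper()) for base in column]
    if None in states:
        return None
    key = tuple(state == states[0] for state in states[1:])
    return CONFIGURATIONS[key]


def count_configurations(sequences):
    """Tally configurations over four aligned, equal-length sequences."""
    if len(sequences) != 4:
        raise ValueError("exactly four aligned sequences are required")
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for column in zip(*sequences):
        label = classify_site(column)
        if label is not None:  # skip sites with gaps or ambiguity codes
            counts[label] += 1
    return counts
```

Calling count_configurations on a list of four aligned strings returns the eight-category count vector (categories with zero sites are simply absent from the Counter). Only the three split categories distinguish among the three possible unrooted four-species topologies, which gives some intuition for why the number of aligned positions matters so much for resolving the central branch.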

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Archaea / classification
  • Models, Statistical
  • Phylogeny*
  • Sampling Studies