Estimation and reliability of molecular sequence alignments

Biometrics. 1995 Mar;51(1):100-13.


The problem of estimating the relatedness of a pair of biological sequences is addressed. A stochastic model of sequence evolution is described that allows insertion and deletion as well as replacement of amino acid residues (or substitution of nucleotides) over time. An expectation-maximization (EM) algorithm that obtains maximum likelihood estimates of the model parameters is introduced. The method assumes that the sequences are related by descent from a common ancestor but the alignment (i.e., the precise evolutionary correspondence between residues in each sequence) is unknown. Results from the E-step of the EM algorithm are used to assess the likelihood that any two residues are related by direct descent from a common ancestor.
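The last sentence of the abstract can be illustrated with a minimal sketch. It is not the authors' model or code: it assumes a generic three-state pair hidden Markov model (match, gap in one sequence, gap in the other) as a stand-in for the evolutionary model, and every name and parameter value below (match_posteriors, delta, epsilon, tau, same, alphabet) is hypothetical. The forward and backward passes sum over all alignments, which is the kind of computation an E-step performs, and their product gives the posterior probability that residue i of one sequence is aligned with residue j of the other.

```python
def match_posteriors(x, y, delta=0.1, epsilon=0.3, tau=0.05, same=0.9, alphabet=4):
    """Posterior probability that x[i] is aligned with y[j] under an assumed pair HMM.

    delta: gap-open probability, epsilon: gap-extend probability,
    tau: termination probability, same: chance an aligned residue is unchanged.
    All values are illustrative, not estimates from any data set.
    """
    n, m = len(x), len(y)
    q = 1.0 / alphabet  # uniform emission for a residue opposite a gap

    def p(a, b):
        # Joint emission for an aligned pair; sums to 1 over all (a, b).
        return q * (same if a == b else (1.0 - same) / (alphabet - 1))

    mm = 1.0 - 2.0 * delta - tau  # match -> match
    gm = 1.0 - epsilon - tau      # gap -> match

    def zeros():
        return [[0.0] * (m + 1) for _ in range(n + 1)]

    # Forward pass: f*[i][j] sums over all partial alignments of x[:i] and y[:j].
    fM, fX, fY = zeros(), zeros(), zeros()
    fM[0][0] = 1.0  # the start state behaves like a match state
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                fM[i][j] = p(x[i - 1], y[j - 1]) * (
                    mm * fM[i - 1][j - 1] + gm * (fX[i - 1][j - 1] + fY[i - 1][j - 1]))
            if i > 0:
                fX[i][j] = q * (delta * fM[i - 1][j] + epsilon * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q * (delta * fM[i][j - 1] + epsilon * fY[i][j - 1])
    total = tau * (fM[n][m] + fX[n][m] + fY[n][m])  # likelihood of the pair

    # Backward pass: b*[i][j] sums over all completions from position (i, j).
    bM, bX, bY = zeros(), zeros(), zeros()
    bM[n][m] = bX[n][m] = bY[n][m] = tau
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            diag = p(x[i], y[j]) * bM[i + 1][j + 1] if i < n and j < m else 0.0
            down = q * bX[i + 1][j] if i < n else 0.0
            right = q * bY[i][j + 1] if j < m else 0.0
            bM[i][j] = mm * diag + delta * (down + right)
            bX[i][j] = gm * diag + epsilon * down
            bY[i][j] = gm * diag + epsilon * right

    # Posterior that x[i] and y[j] occupy the same match column.
    post = [[fM[i][j] * bM[i][j] / total for j in range(1, m + 1)]
            for i in range(1, n + 1)]
    return post, total, bM[0][0]


post, total, backward_total = match_posteriors("ACGT", "ACGT")
```

Under these assumptions, post[i][j] plays the role of the per-residue reliability measure described above, and backward_total should agree with total as a consistency check. A full EM procedure would also accumulate expected transition and emission counts from the same tables and re-estimate the parameters in the M-step, which this sketch omits.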

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Animals
  • Base Sequence*
  • Biological Evolution*
  • Biometry / methods
  • Mathematics
  • Models, Genetic
  • Models, Statistical*
  • Molecular Sequence Data
  • Probability
  • Stochastic Processes