Biases in amino acid replacement matrices and alignment scores due to rate heterogeneity

J Comput Biol. Summer 1996;3(2):307-18. doi: 10.1089/cmb.1996.3.307.

Abstract

Empirically derived amino acid replacement matrices are widely used in sequence comparison and database searches. We consider an extension of the usual Markov process model of protein evolution that admits site-to-site rate heterogeneity, and we demonstrate that rate heterogeneity can introduce a bias in the estimated replacement probabilities and in the alignment scores derived from these matrices. We suggest an approach for obtaining unbiased estimates of replacement probabilities and alignment scores, and we derive the details for the case where rates are assumed to vary according to a gamma distribution.
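
The effect described in the abstract can be illustrated with a small numerical sketch. It is not the paper's estimator. It assumes a hypothetical symmetric 3-state rate matrix Q (a stand-in for the 20 amino acids), a divergence time t, and site rates r drawn from a gamma distribution with shape alpha and mean 1. Under those assumptions, the standard gamma moment generating function gives the site-averaged transition matrix E[exp(Q r t)] = (I - Q t / alpha)^(-alpha), which can be compared with the homogeneous-rate matrix exp(Q t). The function names and parameter values are illustrative only.

```python
import numpy as np

def transition_homogeneous(Q, t):
    """P(t) = exp(Q t) for a symmetric rate matrix Q, via eigendecomposition."""
    w, V = np.linalg.eigh(Q)              # Q = V diag(w) V^T
    return (V * np.exp(w * t)) @ V.T

def transition_gamma(Q, t, alpha):
    """Site-averaged P(t) = E[exp(Q r t)], with r ~ Gamma(shape=alpha, mean=1).

    Uses the gamma moment generating function applied to each eigenvalue:
    E[exp(r * w * t)] = (1 - w * t / alpha) ** (-alpha), valid for w <= 0.
    """
    w, V = np.linalg.eigh(Q)
    return (V * (1.0 - w * t / alpha) ** (-alpha)) @ V.T

# Hypothetical toy rate matrix: symmetric, rows sum to zero.
Q = np.array([[-1.0,  0.5,  0.5],
              [ 0.5, -1.0,  0.5],
              [ 0.5,  0.5, -1.0]])

t, alpha = 1.0, 0.5                       # illustrative values, not from the paper
P_h = transition_homogeneous(Q, t)
P_g = transition_gamma(Q, t, alpha)

print("P(no change), homogeneous rates:", P_h[0, 0])
print("P(no change), gamma rates:      ", P_g[0, 0])
```

In this toy setting the gamma-averaged matrix retains more probability on the diagonal than the homogeneous one at the same average rate, so replacement probabilities read off under a homogeneous-rate assumption would not match those of the heterogeneous process. As alpha grows large the two matrices converge, which is the limit of no rate heterogeneity.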

MeSH terms

  • Amino Acid Sequence
  • Bias
  • Biometry
  • Evolution, Molecular
  • Markov Chains
  • Models, Theoretical
  • Proteins / chemistry
  • Proteins / genetics
  • Sequence Alignment / statistics & numerical data*

Substances

  • Proteins