Pairwise statistical significance and empirical determination of effective gap opening penalties for protein local sequence alignment

Int J Comput Biol Drug Des. 2008;1(4):347-67. doi: 10.1504/ijcbdd.2008.022207.

Abstract

We evaluate several methods for estimating the pairwise statistical significance of a pairwise local sequence alignment in terms of the accuracy of the resulting significance estimates, and compare pairwise statistical significance with popular database search programs in terms of retrieval accuracy on a benchmark database. The results indicate that pairwise statistical significance computed with standard substitution matrices is significantly better than the database statistical significance reported by BLAST and PSI-BLAST, and that it is comparable to, and at times significantly better than, SSEARCH. We also present an application of pairwise statistical significance: the empirical determination of effective gap opening penalties for protein local sequence alignment with the widely used BLOSUM matrices.
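
The abstract does not say which estimation methods were evaluated. As a rough illustration only, the sketch below assumes the common extreme value (Gumbel) model for local alignment scores, P(S >= x) = 1 - exp(-K m n e^(-lambda x)), with lambda and K fitted by the method of moments to scores from alignments against shuffled copies of one sequence. The function names, the moment-based fit, and the example scores are hypothetical and are not taken from the paper.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(null_scores, m, n):
    """Estimate (lambda, K) from null alignment scores by the method of moments.

    Assumption: null_scores are local alignment scores of one sequence against
    shuffled versions of the other; m and n are the two sequence lengths.
    """
    sd = statistics.stdev(null_scores)
    lam = math.pi / (sd * math.sqrt(6.0))                 # Gumbel variance = pi^2 / (6 lambda^2)
    mu = statistics.mean(null_scores) - EULER_GAMMA / lam  # Gumbel mean = mu + gamma / lambda
    K = math.exp(lam * mu) / (m * n)                       # from mu = ln(K m n) / lambda
    return lam, K


def pairwise_pvalue(score, lam, K, m, n):
    """P-value of an alignment score under the fitted Gumbel model."""
    expected = K * m * n * math.exp(-lam * score)  # expected count of scores >= score
    return -math.expm1(-expected)                  # 1 - exp(-expected), stable for tiny values


# Hypothetical null scores, for illustration only.
null_scores = [30, 32, 35, 31, 40, 33, 29, 36]
lam, K = fit_gumbel_moments(null_scores, m=100, n=120)
p = pairwise_pvalue(55, lam, K, m=100, n=120)
```

A real study would use many more shuffles and might prefer a maximum likelihood fit; the moment estimates here are chosen only because they are short and easy to check.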

Publication types

  • Review

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • DNA / chemistry
  • DNA / genetics
  • Databases, Protein / standards
  • Humans
  • Molecular Sequence Data
  • Proteins / chemistry*
  • Proteins / genetics*
  • Sequence Alignment / methods
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Statistics as Topic / methods
  • Statistics as Topic / standards

Substances

  • Proteins
  • DNA