Limits of homology detection by pairwise sequence comparison

Bioinformatics. 2001 Apr;17(4):338-42. doi: 10.1093/bioinformatics/17.4.338.

Abstract

Motivation: Noise in database searches, arising from random sequence similarities, increases as the databases expand rapidly. This noise is not a technical shortcoming of the database search programs but a logical consequence of the idea of homology searching. The effect can be observed in simulation experiments.

Results: We investigated noise levels in database searches based on pairwise alignment. The noise levels of 38 releases of the SwissProt database display perfect logarithmic growth with the total length of the databases. Clustering of real biological sequences reduces noise levels, but the effect is marginal.
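
The logarithmic growth described above can be illustrated with a toy simulation. The sketch below is not the authors' experiment: it assumes, purely for illustration, that the scores of unrelated sequence pairs are independent draws from an exponential distribution, and it treats the best random score in a search against n entries as the noise level. The function name mean_max_score and its parameters are hypothetical.

```python
import math
import random


def mean_max_score(n_scores, trials=200, rate=1.0, seed=0):
    """Average, over `trials` simulated searches, of the best random score.

    Assumption: each unrelated hit scores as an independent exponential
    random variable with the given rate (a toy model, not real alignments).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The best of n_scores chance hits stands in for the noise level.
        total += max(rng.expovariate(rate) for _ in range(n_scores))
    return total / trials


# Simulated database sizes, each ten times larger than the previous one.
sizes = [100, 1000, 10000]
noise = [mean_max_score(n) for n in sizes]

for n, level in zip(sizes, noise):
    # Print the simulated noise level next to ln(n) for comparison.
    print(n, round(level, 2), round(math.log(n), 2))
```

Under this assumption, each tenfold increase in the number of entries should raise the simulated noise level by roughly the same increment (about ln 10 divided by the rate), which is what logarithmic growth means. Real alignment scores depend on the scoring system and on sequence composition, so the sketch shows only the shape of the trend, not its magnitude.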

MeSH terms

  • Computer Simulation
  • Databases, Factual*
  • Mathematical Computing
  • Models, Statistical
  • Proteins / analysis*
  • Sequence Alignment*
  • Sequence Homology, Nucleic Acid*

Substances

  • Proteins