Lingos, finite state machines, and fast similarity searching

J Chem Inf Model. 2006 Sep-Oct;46(5):1912-8. doi: 10.1021/ci6002152.

Abstract

We apply a recently published method of text-based molecular similarity searching (LINGO) to standard data sets to quantify the accuracy of the approach. Our implementation is based on a pattern-matching finite state machine (FSM), which results in fast search times. The accuracy of LINGO is shown to be comparable to that of a path-based fingerprint, making it a simple yet effective method for similarity searching.
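
As a rough illustration of the text-based comparison described in the abstract, the sketch below splits each SMILES string into overlapping substrings and compares the resulting counts. Several details are assumptions rather than anything stated here: the substring length q = 4, the multiset Tanimoto score (sum of minimum counts over sum of maximum counts), the function names `lingos` and `lingo_similarity`, and the omission of any SMILES preprocessing. A hash-based `Counter` stands in for the finite state machine, so this shows the scoring idea only and not the fast FSM implementation.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count the overlapping q-character substrings of a SMILES string.

    q = 4 is an assumed default; strings shorter than q yield no substrings.
    """
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_similarity(a: str, b: str, q: int = 4) -> float:
    """Multiset Tanimoto score between two SMILES strings (assumed scoring form)."""
    ca, cb = lingos(a, q), lingos(b, q)
    shared = sum((ca & cb).values())  # per-substring minimum counts
    total = sum((ca | cb).values())   # per-substring maximum counts
    return shared / total if total else 0.0


if __name__ == "__main__":
    # 1-butanol vs. 1-butylamine, written as simple SMILES strings
    print(lingo_similarity("CCCCO", "CCCCN"))
```

Because the score depends on the literal characters, two different SMILES strings for the same molecule can give a score below 1, so in practice the inputs would need to be written in a consistent (for example canonical) form before comparison.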

MeSH terms

  • Algorithms
  • DNA / chemistry
  • Finite Element Analysis
  • Molecular Structure*
  • Proteins / chemistry

Substances

  • Proteins
  • DNA