Learning deterministic finite automata with a smart state labeling evolutionary algorithm

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1063-74. doi: 10.1109/TPAMI.2005.143.

Abstract

Learning a Deterministic Finite Automaton (DFA) from a training set of labeled strings is a hard task that has been much studied within the machine learning community. It is equivalent to learning a regular language by example and has applications in language modeling. In this paper, we describe a novel evolutionary method for learning DFA that evolves only the transition matrix and uses a simple deterministic procedure to optimally assign state labels. We compare its performance with the Evidence Driven State Merging (EDSM) algorithm, one of the most powerful known DFA learning algorithms. We present results on random DFA induction problems of varying target size and training set density. We also study the effects of noisy training data on the evolutionary approach and on EDSM. On noise-free data, we find that our evolutionary method outperforms EDSM on small sparse data sets. In the case of noisy training data, we find that our evolutionary method consistently outperforms EDSM, as well as other significant methods submitted to two recent competitions.
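
The abstract does not spell out the labeling procedure, so the following is only a minimal sketch of one natural reading: for a fixed transition matrix, run every training string through the automaton, count how many accepted and rejected strings end in each state, and give each state the majority label. For a fixed transition matrix this choice maximizes accuracy on the training set. The function names, the tie-breaking rule (ties become rejecting), and the toy data are assumptions made for illustration, not details taken from the paper.

```python
def final_state(transitions, string, start=0):
    """Follow the transition table from the start state and return the end state."""
    state = start
    for symbol in string:
        state = transitions[state][symbol]
    return state


def assign_labels(transitions, samples, start=0):
    """Label each state by majority vote of the training strings that end there.

    transitions: list of dicts mapping a symbol to the next state index.
    samples: list of (string, is_accepted) pairs.
    Returns (labels, fitness), where fitness is the training accuracy
    obtained with those labels.
    """
    n_states = len(transitions)
    accept = [0] * n_states
    reject = [0] * n_states
    for string, is_accepted in samples:
        state = final_state(transitions, string, start)
        if is_accepted:
            accept[state] += 1
        else:
            reject[state] += 1
    # Assumption: ties (including unvisited states) are labeled as rejecting.
    labels = [accept[s] > reject[s] for s in range(n_states)]
    correct = sum(max(accept[s], reject[s]) for s in range(n_states))
    fitness = correct / len(samples) if samples else 0.0
    return labels, fitness


# Toy example: a two-state transition table that tracks the parity of 1s.
transitions = [{"0": 0, "1": 1}, {"0": 1, "1": 0}]
samples = [
    ("", False), ("1", True), ("10", True),
    ("11", False), ("0110", False), ("111", True),
]
labels, fitness = assign_labels(transitions, samples)
print(labels, fitness)
```

Under this reading, an evolutionary search would only need to mutate entries of `transitions` and could use `fitness` as its objective, since the labels are recomputed deterministically for every candidate.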

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Cluster Analysis
  • Computer Simulation
  • Information Storage and Retrieval / methods*
  • Models, Statistical*
  • Natural Language Processing
  • Numerical Analysis, Computer-Assisted
  • Pattern Recognition, Automated / methods*
  • Sequence Alignment / methods
  • Sequence Analysis / methods*
  • Signal Processing, Computer-Assisted*