Algorithms for identifying Boolean networks and related biological networks based on matrix multiplication and fingerprint function

J Comput Biol. 2000;7(3-4):331-43. doi: 10.1089/106652700750050817.

Abstract

Owing to recent progress in DNA microarray technology, large amounts of gene expression profile data are being produced. How to analyze gene expression data is an important topic in computational molecular biology. Several studies have used the Boolean network as a model of a genetic network. This paper proposes efficient algorithms for identifying Boolean networks of bounded indegree and related biological networks, where identification of a Boolean network can be formalized as a problem of identifying many Boolean functions simultaneously. For the identification of a Boolean network, an O(mn^{D+1}) time naive algorithm and a simple O(mn^D) time algorithm are known, where n denotes the number of nodes, m denotes the number of examples, and D denotes the maximum indegree. This paper presents an improved O(m^{omega-2} n^D + mn^{D+omega-3}) time Monte-Carlo type randomized algorithm, where omega is the exponent of matrix multiplication (currently, omega < 2.376). The algorithm is obtained by combining fast matrix multiplication with the randomized fingerprint function for string matching. Although the algorithm and its analysis are simple, the result is nontrivial and the technique can be applied to several related problems.
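
To make the identification problem concrete, the following is a minimal Python sketch of a brute-force consistency search in the spirit of the naive algorithm mentioned in the abstract. It is not the paper's algorithm and uses neither matrix multiplication nor fingerprints. The function names (consistent_inputs, identify_network), the example format (pairs of 0/1 state tuples), and the three-node toy network are all assumptions introduced for illustration. For each output node it tries every set of D candidate input nodes and keeps the sets for which no two examples agree on the inputs but disagree on the output, which costs roughly m * n^{D+1} basic steps for constant D.

```python
from itertools import combinations, product


def consistent_inputs(examples, target, indegree):
    """Return every set of `indegree` input nodes that can explain node `target`.

    `examples` is a list of (state_before, state_after) pairs of 0/1 tuples
    (an assumed format). An input set is consistent if some Boolean function
    of those inputs reproduces the target bit in every example.
    """
    n = len(examples[0][0])
    found = []
    for inputs in combinations(range(n), indegree):
        table = {}  # observed input pattern -> observed output bit
        ok = True
        for before, after in examples:
            key = tuple(before[j] for j in inputs)
            # Same input pattern with a different output bit rules this set out.
            if table.setdefault(key, after[target]) != after[target]:
                ok = False
                break
        if ok:
            found.append(inputs)
    return found


def identify_network(examples, indegree):
    """Run the consistency search for every node of the network."""
    n = len(examples[0][0])
    return [consistent_inputs(examples, i, indegree) for i in range(n)]


def step(state):
    """Hypothetical toy network: x0' = x1 AND x2, x1' = x0 OR x2, x2' = x0 XOR x1."""
    x0, x1, x2 = state
    return (x1 & x2, x0 | x2, x0 ^ x1)


# All 8 state transitions of the toy network serve as the m examples.
examples = [(s, step(s)) for s in product((0, 1), repeat=3)]
print(identify_network(examples, 2))
# prints [[(1, 2)], [(0, 2)], [(0, 1)]]
```

With all eight transitions available, each node's true input pair is the only consistent one. With fewer examples, several input sets may remain consistent, which is why the number of examples m appears in the analysis.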

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology
  • DNA Fingerprinting / statistics & numerical data
  • Data Interpretation, Statistical
  • Gene Expression Profiling / statistics & numerical data*
  • Models, Genetic
  • Monte Carlo Method
  • Oligonucleotide Array Sequence Analysis