Identification of genetic networks from a small number of gene expression patterns under the Boolean network model

Pac Symp Biocomput. 1999;17-28. doi: 10.1142/9789814447300_0003.


Liang, Fuhrman and Somogyi (PSB98, 18-29, 1998) described an algorithm for inferring genetic network architectures from state transition tables, which correspond to time series of gene expression patterns, using the Boolean network model. Their computational experiments suggested that a small number of state transition (INPUT/OUTPUT) pairs is sufficient to infer the original Boolean network correctly. This paper gives a mathematical proof of their observation. Precisely, it devises a much simpler algorithm for the same problem and proves that, if the indegree of each node (i.e., the number of input nodes to each node) is bounded by a constant, only O(log n) state transition pairs (out of 2^n possible pairs) are necessary and sufficient to identify the original Boolean network of n nodes correctly with high probability. We conducted computational experiments to expose the constant factor hidden in the O(log n) notation. The results show that a Boolean network of size 100,000 can be identified by our algorithm from about 100 INPUT/OUTPUT pairs if the maximum indegree is bounded by 2. A further merit of our algorithm is that it is conceptually simple enough to be extended to more realistic network models.
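
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the kind of consistency check such an identification procedure could use, not the authors' exact method. It assumes states are tuples of 0/1 values and that the indegree bound is k = 2. For one target node it tries every k-subset of nodes as candidate inputs and keeps those for which no input pattern is ever observed with two different outputs. The names `consistent_inputs`, `step`, and `all_pairs`, and the 4-node toy network, are hypothetical and chosen for illustration.

```python
from itertools import combinations, product


def consistent_inputs(pairs, node, k=2):
    """Return (inputs, truth_table) for every k-subset of nodes that is
    consistent with all observed (INPUT, OUTPUT) pairs for `node`.

    A subset is consistent if the same input pattern never maps to two
    different next values of `node`.
    """
    n = len(pairs[0][0])
    candidates = []
    for inputs in combinations(range(n), k):
        table = {}  # observed input pattern -> observed next value of `node`
        ok = True
        for state, nxt in pairs:
            key = tuple(state[i] for i in inputs)
            out = nxt[node]
            # setdefault stores the first observation, then detects conflicts
            if table.setdefault(key, out) != out:
                ok = False
                break
        if ok:
            candidates.append((inputs, table))
    return candidates


def step(state):
    """Hypothetical 4-node Boolean network used only for this illustration."""
    x0, x1, x2, x3 = state
    return (x1 & x2, x0 ^ x3, x1 | x3, 1 - (x0 & x2))


# All 2^n state transition pairs of the toy network (n = 4).
all_pairs = [(s, step(s)) for s in product((0, 1), repeat=4)]
```

With all 16 pairs of this toy network, each node is left with exactly one consistent input pair. With only a subset of the pairs, the true input pair always remains among the candidates, though others may survive as well; the abstract's claim concerns how few randomly chosen pairs are typically needed before only the true candidate remains.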

