CausNet-partial: 'Partial Generational Orderings' based search for optimal sparse Bayesian networks via dynamic programming with parent set constraints

Res Sq [Preprint]. 2024 Mar 7:rs.3.rs-4021074. doi: 10.21203/rs.3.rs-4021074/v1.

Abstract

Background: In our recent work, we developed a novel dynamic programming algorithm to find optimal Bayesian networks (BNs) with parent set constraints. This 'generational orderings' based dynamic programming search algorithm, CausNet, efficiently searches the space of possible BNs given the possible parent sets. The algorithm supports both continuous and categorical data, as well as continuous, binary, and survival outcomes. In the present work, we develop a variant of CausNet, CausNet-partial, which searches the space of 'partial generational orderings'. This further reduces the search space, is suited to finding smaller sparse optimal Bayesian networks, and can be applied to thousands of variables.
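To make the idea of dynamic programming under parent set constraints concrete, the following is a minimal sketch of the textbook sink-based dynamic program over node subsets, restricted to candidate parent sets and a maximum in-degree. It is not the CausNet or CausNet-partial algorithm, which search generational orderings; the function names, the toy score table, and the parameter `max_in_degree` are assumptions made for illustration only.

```python
from itertools import combinations


def best_parents(node, allowed, candidates, max_in_degree, local_score):
    """Best-scoring parent set for `node`, drawn from candidates[node] & allowed."""
    pool = sorted(candidates[node] & allowed)
    best = (local_score(node, frozenset()), frozenset())
    for k in range(1, min(max_in_degree, len(pool)) + 1):
        for combo in combinations(pool, k):
            parent_set = frozenset(combo)
            s = local_score(node, parent_set)
            if s > best[0]:
                best = (s, parent_set)
    return best


def optimal_network(nodes, candidates, max_in_degree, local_score):
    """Generic sink-based DP: exponential in the number of nodes (toy sizes only)."""
    nodes = sorted(nodes)
    # table[subset] = (best total score, chosen sink, parents of that sink)
    table = {frozenset(): (0.0, None, None)}
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            subset = frozenset(combo)
            best = None
            for sink in sorted(subset):
                rest = subset - {sink}
                s, parent_set = best_parents(
                    sink, rest, candidates, max_in_degree, local_score
                )
                total = table[rest][0] + s
                if best is None or total > best[0]:
                    best = (total, sink, parent_set)
            table[subset] = best
    # Backtrack from the full node set, peeling off one sink at a time.
    parents = {}
    remaining = frozenset(nodes)
    while remaining:
        _, sink, parent_set = table[remaining]
        parents[sink] = parent_set
        remaining = remaining - {sink}
    return table[frozenset(nodes)][0], parents


# Hypothetical toy scores (higher is better); unlisted parent sets score 0.
TOY_SCORES = {
    ("B", frozenset({"A"})): 2.0,
    ("C", frozenset({"A"})): 1.0,
    ("C", frozenset({"B"})): 1.0,
    ("C", frozenset({"A", "B"})): 3.0,
}


def toy_score(node, parent_set):
    return TOY_SCORES.get((node, parent_set), 0.0)


candidates = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
score, parents = optimal_network(["A", "B", "C"], candidates, 2, toy_score)
print(score, parents)
```

In this sketch the candidate parent sets shrink the inner enumeration, but the outer loop still visits every node subset, which is why restricting the search space further matters for larger problems.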

Results: We test this method on both synthetic and real data. Our algorithm performs better than three state-of-the-art algorithms that are currently used extensively to find optimal BNs. We apply it to simulated continuous data and to the benchmark discrete Bayesian network ALARM, which was designed to provide an alarm message system for patient monitoring. We first apply the original CausNet and then CausNet-partial, varying the partial order from 5 to 2. CausNet-partial discovers small sparse networks with drastically reduced runtime, as expected from theory.

Conclusions: Our partial generational orderings based search is an efficient and highly scalable approach for finding optimal small, sparse Bayesian networks and can be applied to thousands of variables. Using specifiable parameters (correlation and FDR cutoffs, in-degree, and partial order), one can increase or decrease the number of nodes and the density of the networks. The availability of two scoring options, BIC and BGe, together with the implementation for survival outcomes and mixed data types, makes our algorithm suitable for many types of high-dimensional data in a variety of fields.
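As an illustration of how correlation and FDR cutoffs can control the number of candidate edges, the sketch below screens variable pairs with a Pearson correlation threshold and a Benjamini-Hochberg procedure. How CausNet applies these cutoffs internally is not described in this abstract, so everything here is an assumption: the function names, the Fisher z approximation for p-values, and the symmetric candidate sets are hypothetical choices, not the package's API.

```python
from math import atanh, sqrt
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_pvalue(x, y):
    """Return (r, two-sided p-value) using the Fisher z approximation (n > 3)."""
    r = pearson(x, y)
    r_clamped = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = atanh(r_clamped) * sqrt(len(x) - 3)
    return r, 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvalues, fdr):
    """Indices of hypotheses rejected at the given false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= fdr * rank / m:
            cutoff_rank = rank  # largest rank passing the step-up condition
    return set(order[:cutoff_rank])


def candidate_parents(data, min_abs_corr, fdr):
    """Keep a pair as mutual candidates if it passes both cutoffs."""
    names = sorted(data)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    stats = [correlation_pvalue(data[a], data[b]) for a, b in pairs]
    keep = benjamini_hochberg([p for _, p in stats], fdr)
    result = {name: set() for name in names}
    for idx, (a, b) in enumerate(pairs):
        if idx in keep and abs(stats[idx][0]) >= min_abs_corr:
            result[a].add(b)
            result[b].add(a)
    return result


# Deterministic toy data: y tracks x closely, z alternates in sign.
x = [float(i) for i in range(20)]
z = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]
y = [2.0 * a + 0.5 * b for a, b in zip(x, z)]
candidates = candidate_parents({"x": x, "y": y, "z": z}, min_abs_corr=0.3, fdr=0.05)
print(candidates)
```

Tightening either cutoff leaves fewer candidate parents per node, which in turn yields smaller and sparser networks; loosening them has the opposite effect.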

Keywords: Optimal Bayesian Network; dynamic programming; generational orderings.

Publication types

  • Preprint