A million variables and more: the Fast Greedy Equivalence Search algorithm for learning high-dimensional graphical causal models, with an application to functional magnetic resonance images

Int J Data Sci Anal. 2017 Mar;3(2):121-129. doi: 10.1007/s41060-016-0032-z. Epub 2016 Dec 1.

Abstract

We describe two modifications that parallelize and reorganize caching in the well-known Greedy Equivalence Search (GES) algorithm for discovering directed acyclic graphs (DAGs) on random variables from sample values. We apply one of these modifications, the Fast Greedy Search (FGS) assuming faithfulness, to an i.i.d. sample of 1,000 units and recover, with high precision and good recall, an average-degree-2 DAG with one million Gaussian variables. We also describe a modification of the algorithm that rapidly finds the Markov blanket of any variable in a high-dimensional system. Using 51,000 voxels that parcellate an entire human cortex, we apply the FGS algorithm to Blood Oxygenation Level Dependent (BOLD) time series obtained from resting-state fMRI.
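
For readers unfamiliar with the term, the Markov blanket of a variable in a DAG consists of its parents, its children, and the other parents of its children. The minimal sketch below only illustrates that definition on a graph that is already known. It is not the authors' search procedure, which estimates the blanket from data. The function name `markov_blanket`, the edge-list representation, and the toy graph are assumptions made for illustration.

```python
from collections import defaultdict


def markov_blanket(edges, target):
    """Return the Markov blanket of `target` in a DAG.

    `edges` is an iterable of (parent, child) pairs (a hypothetical
    representation chosen for this sketch). The blanket is the union of
    the target's parents, its children, and its children's other parents.
    """
    parents = defaultdict(set)
    children = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
        children[parent].add(child)

    blanket = set(parents[target]) | set(children[target])
    for child in children[target]:
        blanket |= parents[child]  # co-parents ("spouses") of the target
    blanket.discard(target)  # the target is not part of its own blanket
    return blanket


# Toy DAG: E -> A -> T -> C <- S, and C -> D
example_edges = [("E", "A"), ("A", "T"), ("T", "C"), ("S", "C"), ("C", "D")]
blanket_of_t = markov_blanket(example_edges, "T")
```

In the toy graph, the blanket of `T` contains its parent `A`, its child `C`, and the co-parent `S`. The more distant nodes `E` and `D` are excluded.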