Flexible parametrization of graph-theoretical features from individual-specific networks for prediction

Stat Med. 2024 Apr 25. doi: 10.1002/sim.10091. Online ahead of print.

Abstract

Statistical techniques are needed to analyze data structures with complex dependencies such that clinically useful information can be extracted. Individual-specific networks, which capture dependencies in complex biological systems, are often summarized by graph-theoretical features. These features, which lend themselves to outcome modeling, can be subject to high variability due to arbitrary decisions in network inference and noise. Correlation-based adjacency matrices often need to be sparsified before meaningful graph-theoretical features can be extracted, requiring the data analysts to determine an optimal threshold. To address this issue, we propose to incorporate a flexible weighting function over the full range of possible thresholds to capture the variability of graph-theoretical features over the threshold domain. The potential of this approach, which extends concepts from functional data analysis to a graph-theoretical setting, is explored in a plasmode simulation study using real functional magnetic resonance imaging (fMRI) data from the Autism Brain Imaging Data Exchange (ABIDE) Preprocessed initiative. The simulations show that our modeling approach yields accurate estimates of the functional form of the weight function, improves inference efficiency, and achieves a comparable or reduced root mean square prediction error compared to competitor modeling approaches. This assertion holds true in settings where both complex functional forms underlie the outcome-generating process and a universal threshold value is employed. We demonstrate the practical utility of our approach by using resting-state fMRI data to predict biological age in children. Our study establishes the flexible modeling approach as a statistically principled, serious competitor to ad-hoc methods with superior performance.

Keywords: complex systems; fMRI; networks; prediction; splines.