An Augmented High-Dimensional Graphical Lasso Method to Incorporate Prior Biological Knowledge for Global Network Learning

Front Genet. 2022 Jan 27:12:760299. doi: 10.3389/fgene.2021.760299. eCollection 2021.


Biological networks are often inferred through Gaussian graphical models (GGMs) using gene or protein expression data only. GGMs identify conditional dependence among genes or proteins by estimating a precision matrix. However, conventional GGM approaches often ignore prior knowledge about protein-protein interactions (PPI). Recently, several groups have extended GGMs to weighted graphical Lasso (wGlasso) and network-based gene set analysis (Netgsa) and have demonstrated the advantages of incorporating PPI information. However, these methods are either computationally intractable for large-scale data or disregard the weights in PPI networks. To address these shortcomings, we extended the Netgsa approach and developed an augmented high-dimensional graphical Lasso (AhGlasso) method that combines edge weights from known PPI networks with omics data for global network learning. In simulated large-scale data settings, this new method outperforms weighted graphical Lasso-based algorithms in computational time while achieving better or comparable prediction accuracy of node connections. In total runtime, AhGlasso is approximately five times faster than weighted Glasso methods when the graph size ranges from 1,000 to 3,000 with a fixed sample size (n = 300), and the runtime gap widens as the graph size increases. Using proteomic data from a study on chronic obstructive pulmonary disease, we demonstrate that, by incorporating PPI information, AhGlasso improves protein network inference compared to the Netgsa approach.
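
As a minimal illustration of the GGM idea mentioned above (and not of AhGlasso itself), the sketch below shows how a precision matrix encodes conditional dependence: a zero off-diagonal entry means the two nodes are conditionally independent given the rest, even when they are marginally correlated. The 3-node chain, the matrix values, and the helper name `partial_correlations` are assumptions made purely for illustration.

```python
import numpy as np


def partial_correlations(precision):
    """Convert a precision matrix into a partial-correlation matrix.

    Uses the standard identity rho_ij = -theta_ij / sqrt(theta_ii * theta_jj).
    """
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Hypothetical 3-node chain A - B - C: A and C interact only through B,
# so the (A, C) entry of the precision matrix is zero.
theta = np.array([
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
])

sigma = np.linalg.inv(theta)        # covariance implied by this precision matrix
pcor = partial_correlations(theta)  # conditional (partial) correlations

# A and C are marginally correlated (nonzero covariance) ...
marginal_ac = sigma[0, 2]
# ... but conditionally independent given B (zero partial correlation).
conditional_ac = pcor[0, 2]
```

Here the marginal covariance between A and C is nonzero, while their partial correlation is zero, which is why GGM-based network inference reads edges from the estimated precision matrix and not from the covariance matrix.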

Keywords: Gaussian graphical model; gene network; graphical Lasso; protein-protein interaction; systems biology.