Tissue specific tumor-gene link prediction through sampling based GNN using a heterogeneous network

Med Biol Eng Comput. 2024 Apr 18. doi: 10.1007/s11517-024-03087-y. Online ahead of print.

Abstract

A tissue sample is a valuable resource for understanding a patient's symptoms and health status in relation to tumor growth. Recent research seeks to connect tissue-specific tumor samples with genetic markers (genes), work that has paved the way for personalized cancer therapies. With this motivation, the proposed model constructs a heterogeneous network from tumor sample-gene relation data and gene-gene interaction data. The network also incorporates tissue-specific gene expression and primary site-based gene counts as features, enabling tissue-specific predictions. Graph neural networks (GNNs) have proven effective in modeling complex interactions and predicting links within this network. The proposed model predicts tumor sample-gene associations by leveraging sampling-based GNNs and link layer embedding. The model's performance metrics, including AUC-ROC, reached approximately 94%, demonstrating the potential of this heterogeneous network for predicting tissue-specific tumor sample-gene links. These findings highlight the importance of tissue-specific associations in cancer research.
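As a rough illustration of the pipeline outlined in the abstract, the following Python sketch builds a toy heterogeneous network (tumor sample and gene nodes, sample-gene and gene-gene edges), samples a fixed number of neighbors per node, aggregates their features by averaging, and scores a candidate sample-gene link from an element-wise product of the two node embeddings. It is a minimal sketch under stated assumptions, not the authors' implementation: the node IDs, edges, two-dimensional features, and every function name are hypothetical, no weights are learned, and the mean aggregator and Hadamard-product link embedding are stand-ins for whatever sampling-based GNN and link layer the paper actually uses.

```python
import random
from collections import defaultdict

# Hypothetical toy data, invented for illustration only.
SAMPLE_GENE_EDGES = [("S1", "TP53"), ("S1", "KRAS"), ("S2", "TP53"), ("S3", "EGFR")]
GENE_GENE_EDGES = [("TP53", "KRAS"), ("KRAS", "EGFR")]

# Node types make the network heterogeneous (two node types, two edge types).
NODE_TYPE = {"S1": "sample", "S2": "sample", "S3": "sample",
             "TP53": "gene", "KRAS": "gene", "EGFR": "gene"}

# Assumed 2-d features, standing in for tissue-specific expression
# and primary site-based gene counts.
FEATURES = {
    "S1": [1.0, 0.0], "S2": [0.8, 0.2], "S3": [0.1, 0.9],
    "TP53": [0.9, 0.1], "KRAS": [0.7, 0.3], "EGFR": [0.2, 0.8],
}


def build_adjacency(*edge_lists):
    """Merge several undirected edge lists into one adjacency map."""
    adj = defaultdict(set)
    for edges in edge_lists:
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def sample_neighbors(adj, node, k, rng):
    """Return at most k neighbors of node, chosen uniformly without replacement."""
    neighbors = sorted(adj[node])
    if len(neighbors) <= k:
        return neighbors
    return rng.sample(neighbors, k)


def embed(adj, node, k, rng):
    """One weight-free, GraphSAGE-style step: own features concatenated
    with the mean of the sampled neighbors' features."""
    own = FEATURES[node]
    sampled = sample_neighbors(adj, node, k, rng)
    if sampled:
        mean = [sum(FEATURES[n][d] for n in sampled) / len(sampled)
                for d in range(len(own))]
    else:
        mean = [0.0] * len(own)
    return own + mean


def link_embedding(u_emb, v_emb):
    """Element-wise (Hadamard) product of two node embeddings."""
    return [a * b for a, b in zip(u_emb, v_emb)]


def link_score(adj, sample, gene, k, rng):
    """Unnormalized score for a candidate sample-gene link."""
    assert NODE_TYPE[sample] == "sample" and NODE_TYPE[gene] == "gene"
    return sum(link_embedding(embed(adj, sample, k, rng), embed(adj, gene, k, rng)))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so neighbor sampling is repeatable
    adj = build_adjacency(SAMPLE_GENE_EDGES, GENE_GENE_EDGES)
    for gene in ("EGFR", "TP53"):
        print("S3", gene, round(link_score(adj, "S3", gene, 2, rng), 3))
```

In a trained model the concatenated vector would pass through learned weight matrices and a nonlinearity at each layer, and the link embedding would feed a classifier whose predictions on held-out sample-gene pairs could then be evaluated with AUC-ROC.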

Keywords: Data integration; Graph embedding; Graph neural networks (GNNs); Heterogeneous network; Tissue specific cancer research.