FREQUENT SUBGRAPH MINING OF PERSONALIZED SIGNALING PATHWAY NETWORKS GROUPS PATIENTS WITH FREQUENTLY DYSREGULATED DISEASE PATHWAYS AND PREDICTS PROGNOSIS

Pac Symp Biocomput. 2017;22:402-413. doi: 10.1142/9789813207813_0038.

Abstract

Motivation: Large-scale genomics studies have generated comprehensive molecular characterizations of numerous cancer types. Subtypes for many tumor types have been established; however, these classifications are based on the molecular characteristics of small gene sets, which have limited power to detect dysregulation at the patient level. We hypothesize that frequent graph mining of pathways, used to gather the pathways functionally relevant to tumors, can characterize tumor types and provide opportunities for personalized therapies.

Results: In this study we present an integrative omics approach that groups patients based on their altered pathway characteristics, and we show prognostic differences within breast cancer (p < 9.57E-10) and glioblastoma multiforme (p < 0.05) patients. We were able to validate this approach in secondary RNA-Seq datasets with p < 0.05 and p < 0.01, respectively. We also performed pathway enrichment analysis to further investigate the biological relevance of the dysregulated pathways. We compared our approach with network-based classifier algorithms and showed that our unsupervised approach generates more robust and biologically relevant clustering, whereas previous approaches failed to report specific functions for similar patient groups or to classify patients into prognostic groups.

Conclusions: These results could serve as a means to improve prognosis for future cancer patients, and to provide opportunities for improved treatment options and personalized interventions. The proposed novel graph mining approach is able to integrate PPI networks with gene expression in a biologically sound manner and cluster patients into clinically distinct groups. We have utilized breast cancer and glioblastoma multiforme datasets from microarray and RNA-Seq platforms and identified disease mechanisms differentiating samples.
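
Illustrative sketch: the general idea of grouping patients by frequently shared network structure can be shown with a toy example. The code below is not the authors' method (their code is at the repository listed under Supplementary information); it is a deliberately simplified stand-in that mines frequent single edges rather than full subgraphs. The function names, the `min_support` parameter, the node labels A to D, and the patient IDs P1 to P4 are all assumptions made for illustration, and the toy networks are not real data.

```python
from collections import Counter


def normalize(edge):
    """Treat interactions as undirected: ("B", "A") and ("A", "B") are the same edge."""
    return tuple(sorted(edge))


def frequent_edges(patient_networks, min_support):
    """Return the edges that occur in at least `min_support` patient networks."""
    counts = Counter()
    for edges in patient_networks.values():
        # Count each edge at most once per patient.
        counts.update({normalize(e) for e in edges})
    return {edge for edge, n in counts.items() if n >= min_support}


def patient_signatures(patient_networks, frequent):
    """Map each patient to the set of frequent edges found in that patient's network."""
    return {
        patient: frozenset({normalize(e) for e in edges} & frequent)
        for patient, edges in patient_networks.items()
    }


def group_by_signature(signatures):
    """Group patients whose frequent-edge signatures are identical."""
    groups = {}
    for patient, signature in signatures.items():
        groups.setdefault(signature, []).append(patient)
    return groups


# Toy input (hypothetical): each patient has a list of dysregulated interactions.
patients = {
    "P1": [("A", "B"), ("B", "C")],
    "P2": [("B", "A"), ("B", "C")],
    "P3": [("C", "D")],
    "P4": [("C", "D"), ("A", "B")],
}

frequent = frequent_edges(patients, min_support=2)
groups = group_by_signature(patient_signatures(patients, frequent))
for signature, members in groups.items():
    print(sorted(signature), sorted(members))
```

In this toy input, P1 and P2 share the same frequent-edge signature and fall into one group, while P3 and P4 each form their own. A real analysis would mine connected subgraphs of more than one edge and would then compare survival between the resulting groups, which this sketch does not attempt.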

Supplementary information: Supplementary methods, figures, tables and code are available at https://github.com/bebeklab/dysprog.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Breast Neoplasms / classification
  • Breast Neoplasms / genetics
  • Cluster Analysis
  • Computational Biology
  • Data Mining / methods*
  • Databases, Nucleic Acid / statistics & numerical data
  • Disease / classification*
  • Disease / genetics*
  • Female
  • Gene Expression Profiling / statistics & numerical data
  • Glioblastoma / classification
  • Glioblastoma / genetics
  • Humans
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data
  • Precision Medicine / statistics & numerical data
  • Prognosis
  • Protein Interaction Maps / genetics
  • Signal Transduction / genetics