DART: Denoising Algorithm based on Relevance network Topology improves molecular pathway activity inference

BMC Bioinformatics. 2011 Oct 19;12:403. doi: 10.1186/1471-2105-12-403.

Abstract

Background: Inferring molecular pathway activity is an important step towards reducing the complexity of genomic data, understanding the heterogeneity in clinical outcome, and obtaining molecular correlates of cancer imaging traits. Increasingly, approaches to pathway activity inference combine molecular profiles (e.g. gene or protein expression) with independent and highly curated structural interaction data (e.g. protein interaction networks) or, more generally, with prior-knowledge pathway databases. However, it is unclear how best to use this prior pathway knowledge in the context of the molecular profiles of any given study.

Results: We present an algorithm called DART (Denoising Algorithm based on Relevance network Topology) which filters out noise before estimating pathway activity. Using simulated and real multidimensional cancer genomic data and by comparing DART to other algorithms which do not assess the relevance of the prior pathway information, we here demonstrate that substantial improvement in pathway activity predictions can be made if prior pathway information is denoised before predictions are made. We also show that genes encoding hubs in expression correlation networks represent more reliable markers of pathway activity. Using the Netpath resource of signalling pathways in the context of breast cancer gene expression data we further demonstrate that DART leads to more robust inferences about pathway activity correlations. Finally, we show that DART identifies a hypothesized association between oestrogen signalling and mammographic density in ER+ breast cancer.

Conclusions: Evaluating the consistency of prior information of pathway databases in molecular tumour profiles may substantially improve the subsequent inference of pathway activity in clinical tumour specimens. This de-noising strategy should be incorporated in approaches which attempt to infer pathway activity from prior pathway models.
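
Illustrative sketch: the abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes, namely checking prior pathway information against the expression data, discarding inconsistent genes, and giving hub genes of the resulting correlation network more weight. It is not the published DART implementation. The function name `dart_like_activity`, the correlation threshold of 0.3, the +1/-1 encoding of prior direction, and the degree-based weighting are all assumptions made for illustration.

```python
import numpy as np


def dart_like_activity(expr, prior_sign, corr_threshold=0.3):
    """Hypothetical denoise-then-score sketch (not the published method).

    expr       : array of shape (genes, samples) with expression values.
    prior_sign : +1 / -1 per gene, the direction the prior pathway model
                 predicts when the pathway is active (assumed encoding).
    Returns (activity per sample, boolean mask of kept genes, node degrees).
    """
    expr = np.asarray(expr, dtype=float)
    prior_sign = np.asarray(prior_sign, dtype=float)

    # Standardise each gene across samples (assumes no constant genes).
    z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)

    # Gene-gene Pearson correlations; ignore self-correlation.
    corr = np.corrcoef(expr)
    np.fill_diagonal(corr, 0.0)

    # Two genes predicted to move in the same direction should correlate
    # positively; genes predicted to move in opposite directions, negatively.
    expected = np.outer(prior_sign, prior_sign)

    # Keep an edge only if the correlation is strong enough AND its sign
    # agrees with the prior information (the "denoising" step).
    adjacency = (np.abs(corr) >= corr_threshold) & (np.sign(corr) == expected)

    # Genes left without any consistent edge are pruned.
    degree = adjacency.sum(axis=1)
    keep = degree > 0
    if not keep.any():
        raise ValueError("no gene is consistent with the prior information")

    # Degree-weighted, sign-adjusted average, so hubs contribute more.
    weights = degree[keep] / degree[keep].sum()
    activity = (weights * prior_sign[keep]) @ z[keep]
    return activity, keep, degree
```

A call such as `activity, keep, degree = dart_like_activity(expr, prior_sign)` returns one score per sample together with the mask of genes that survived pruning. Choosing the threshold from a significance test on the correlations, rather than a fixed cut-off, would be a natural refinement of this sketch.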

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Breast Neoplasms / diagnosis
  • Breast Neoplasms / genetics
  • Computer Simulation
  • Female
  • Gene Expression Profiling
  • Humans
  • Lung Neoplasms / genetics
  • Mammography
  • Neoplasms / genetics*
  • Protein Interaction Maps*
  • Signal Transduction