MVDA: a multi-view genomic data integration methodology

BMC Bioinformatics. 2015 Aug 19:16:261. doi: 10.1186/s12859-015-0680-3.

Abstract

Background: Multiple high-throughput molecular profiling by omics technologies can be collected for the same individuals. Combining these data, rather than exploiting them separately, can significantly increase the power of clinically relevant patients subclassifications.

Results: We propose a multi-view approach in which the information from different data layers (views) is integrated at the levels of the results of each single view clustering iterations. It works by factorizing the membership matrices in a late integration manner. We evaluated the effectiveness and the performance of our method on six multi-view cancer datasets. In all the cases, we found patient sub-classes with statistical significance, identifying novel sub-groups previously not emphasized in literature. Our method performed better as compared to other multi-view clustering algorithms and, unlike other existing methods, it is able to quantify the contribution of single views on the final results.

Conclusion: Our observations suggest that integration of prior information with genomic features in the subtyping analysis is an effective strategy in identifying disease subgroups. The methodology is implemented in R and the source code is available online at http://neuronelab.unisa.it/a-multi-view-genomic-data-integration-methodology/ .

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Genomics / methods*
  • MicroRNAs / genetics
  • MicroRNAs / metabolism
  • Sequence Analysis, RNA

Substances

  • MicroRNAs