Biclustering with heterogeneous variance

Proc Natl Acad Sci U S A. 2013 Jul 23;110(30):12253-8. doi: 10.1073/pnas.1304376110. Epub 2013 Jul 8.

Abstract

In cancer research, as in all of medicine, it is important to classify patients into etiologically and therapeutically relevant subtypes to improve diagnosis and treatment. One way to do this is to use clustering methods to find subgroups of homogeneous individuals based on genetic profiles together with heuristic clinical analysis. A notable drawback of existing clustering methods is that they ignore the possibility that the variance of gene expression profile measurements can be heterogeneous across subgroups, and methods that do not consider heterogeneity of variance can lead to inaccurate subgroup prediction. Research has shown that hypervariability is a common feature among cancer subtypes. In this paper, we present a statistical approach that can capture both mean and variance structure in genetic data. We demonstrate the strength of our method in both synthetic data and in two cancer data sets. In particular, our method confirms the hypervariability of methylation level in cancer patients, and it detects clearer subgroup patterns in lung cancer data.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Cluster Analysis
  • DNA / genetics
  • Gene Expression Profiling*
  • Humans
  • Lung Neoplasms / genetics*

Substances

  • DNA