'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns

T Hastie; R Tibshirani; M B Eisen; A Alizadeh; R Levy; L Staudt; W C Chan; D Botstein; P Brown

doi:10.1186/gb-2000-1-2-research0003

Genome Biol. 2000;1(2):RESEARCH0003. doi: 10.1186/gb-2000-1-2-research0003. Epub 2000 Aug 4.

Authors

T Hastie¹, R Tibshirani, M B Eisen, A Alizadeh, R Levy, L Staudt, W C Chan, D Botstein, P Brown

Affiliation

¹ Department of Statistics, Sequoia Hall, Stanford University, Stanford, CA 94305, USA. tibs@stat.stanford.edu

Abstract

Background: Large gene expression studies, such as those conducted using DNA arrays, often provide millions of different pieces of data. To address the problem of analyzing such data, we describe a statistical method, which we have called 'gene shaving'. The method identifies subsets of genes with coherent expression patterns and large variation across conditions. Gene shaving differs from hierarchical clustering and other widely used methods for analyzing gene expression studies in that genes may belong to more than one cluster, and the clustering may be supervised by an outcome measure. The technique can be 'unsupervised', that is, the genes and samples are treated as unlabeled, or partially or fully supervised by using known properties of the genes or samples to assist in finding meaningful groupings.

Results: We illustrate the use of the gene shaving method to analyze gene expression measurements made on samples from patients with diffuse large B-cell lymphoma. The method identifies a small cluster of genes whose expression is highly predictive of survival.

Conclusions: The gene shaving method is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worth further investigation.
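
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the shaving step. It assumes the procedure is: center each gene across samples, find the leading principal component of the current gene set, discard the fraction of genes least aligned with it (smallest absolute inner product), and repeat until one gene remains. The function names, the 10% shaving fraction, the power-iteration solver, and the toy matrix are all illustrative assumptions. Two parts of the full method are omitted here: choosing the cluster size from the nested sequence, and the optional supervision by an outcome measure.

```python
import math

def center_rows(X):
    """Subtract each gene's mean across samples (rows = genes, columns = samples)."""
    return [[x - sum(row) / len(row) for x in row] for row in X]

def leading_component(X, iters=200):
    """Leading principal direction in sample space, by power iteration on X^T X."""
    n_samples = len(X[0])
    # Start from the largest-norm row (an assumption; any non-orthogonal start works).
    v = list(max(X, key=lambda r: sum(x * x for x in r)))
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(row, v)) for row in X]  # X v
        w = [sum(scores[g] * X[g][j] for g in range(len(X)))        # X^T (X v)
             for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def shave_sequence(X, frac=0.1):
    """Return nested gene-index sets, largest first, ending with a single gene."""
    Xc = center_rows(X)
    genes = list(range(len(Xc)))
    nested = [list(genes)]
    while len(genes) > 1:
        v = leading_component([Xc[g] for g in genes])
        # Absolute inner product: anti-correlated genes count as aligned.
        score = {g: abs(sum(a * b for a, b in zip(Xc[g], v))) for g in genes}
        n_drop = max(1, int(frac * len(genes)))
        ranked = sorted(genes, key=lambda g: score[g], reverse=True)
        genes = sorted(ranked[: len(genes) - n_drop])
        nested.append(list(genes))
    return nested

# Hypothetical toy data: genes 0-2 share one strong pattern (gene 2 inverted),
# genes 3-5 are low-variance noise.
toy = [
    [4.0, -4.0, 4.0, -4.0],
    [3.0, -3.0, 3.0, -3.0],
    [-2.0, 2.0, -2.0, 2.0],
    [0.2, 0.1, -0.1, -0.2],
    [0.1, -0.1, -0.1, 0.1],
    [0.0, 0.1, 0.0, -0.1],
]
nested = shave_sequence(toy)
for gene_set in nested:
    print(len(gene_set), gene_set)
```

On this toy matrix the low-variance genes are shaved first, so the three-gene set in the sequence is genes 0, 1 and 2. Because each set is a subset of the previous one, a separate criterion is still needed to pick one set from the sequence as the cluster.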

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Algorithms*
Cluster Analysis*
Computational Biology / methods
Gene Expression Profiling / methods*
Gene Expression Regulation, Neoplastic*
Humans
Lymphoma, B-Cell / diagnosis
Lymphoma, B-Cell / genetics*
Lymphoma, B-Cell / mortality
Lymphoma, Large B-Cell, Diffuse / diagnosis
Lymphoma, Large B-Cell, Diffuse / genetics*
Lymphoma, Large B-Cell, Diffuse / mortality
Oligonucleotide Array Sequence Analysis / methods*
RNA, Messenger / analysis
RNA, Messenger / genetics
RNA, Messenger / metabolism
Survival Analysis

Substances

RNA, Messenger
