Statistical analysis of microarray data: a Bayesian approach

Biostatistics. 2003 Oct;4(4):597-620. doi: 10.1093/biostatistics/4.4.597.

Abstract

The potential of microarray data is enormous. It allows us to monitor the expression of thousands of genes simultaneously. A common task with microarrays is to determine which genes are differentially expressed between two samples obtained under two different conditions. Recently, several statistical methods have been proposed to perform such a task when there are replicate samples under each condition. Two major problems arise with microarray data. The first is that the number of replicates is very small (usually 2-10), leading to noisy point estimates. As a consequence, traditional statistics based on means and standard deviations, e.g. the t-statistic, are not suitable. The second problem is that the number of genes is usually very large (approximately 10,000), so one is faced with an extreme multiple testing problem. Most multiple testing adjustments are relatively conservative, especially when the number of replicates is small. In this paper we present an empirical Bayes analysis that handles both problems very well. Using different parametrizations, we develop four statistics that can be used to test hypotheses about the means and/or variances of the gene expression levels in both one- and two-sample problems. The methods are illustrated using experimental data with prior knowledge. In addition, we present the results of a simulation comparing our methods to well-known statistics and multiple testing adjustments.
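The two problems named in the abstract can be made concrete with a minimal Python sketch. It is not the paper's empirical Bayes method and does not reproduce any of its four statistics. It shows an ordinary pooled-variance t-statistic, a generic variance-shrinkage step that pulls each gene's variance toward the average across genes, and the Bonferroni per-test threshold as one example of a conservative adjustment. All function names and the fixed `prior_weight` are hypothetical choices made for illustration; an empirical Bayes analysis would estimate the amount of shrinkage from the data.

```python
import math
import statistics


def pooled_variance(x, y):
    """Pooled sample variance of two groups of replicates."""
    nx, ny = len(x), len(y)
    return ((nx - 1) * statistics.variance(x)
            + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)


def t_statistic(x, y, variance=None):
    """Two-sample t-statistic; uses the pooled variance unless one is supplied."""
    if variance is None:
        variance = pooled_variance(x, y)
    standard_error = math.sqrt(variance * (1 / len(x) + 1 / len(y)))
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


def shrunken_variances(variances, prior_weight=0.5):
    """Pull each gene's variance toward the mean variance across genes.

    prior_weight is a hypothetical fixed constant in [0, 1]; it is an
    assumption of this sketch, not a quantity taken from the paper.
    """
    target = statistics.mean(variances)
    return [prior_weight * target + (1 - prior_weight) * v for v in variances]


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that bounds the family-wise error at alpha."""
    return alpha / n_tests


# Made-up expression values for two genes, two replicates per condition.
genes = {
    "gene_a": ([1.00, 1.02], [1.10, 1.12]),  # tiny variance by chance
    "gene_b": ([1.0, 3.0], [2.0, 4.0]),
}
raw = [pooled_variance(x, y) for x, y in genes.values()]
shrunk = shrunken_variances(raw)
for (name, (x, y)), v in zip(genes.items(), shrunk):
    print(name, "ordinary t:", t_statistic(x, y), "shrunken t:", t_statistic(x, y, v))
print("Bonferroni threshold for 10,000 genes:", bonferroni_threshold(0.05, 10_000))
```

With only two replicates, a gene whose variance happens to be tiny receives a very large ordinary t-statistic. Borrowing variance information from the other genes damps that effect, which is the general motivation for shrinkage-type statistics in this setting.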

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Bacillus anthracis / drug effects
  • Bacillus anthracis / genetics
  • Bayes Theorem*
  • Carbon Dioxide / pharmacology
  • Computer Simulation
  • Data Interpretation, Statistical
  • Gene Expression Profiling / statistics & numerical data
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • ROC Curve
  • Sample Size

Substances

  • Carbon Dioxide