Microarray data simulator for improved selection of differentially expressed genes

Cancer Biol Ther. Jul-Aug 2003;2(4):383-91. doi: 10.4161/cbt.2.4.431.


The development of microarray technology has allowed researchers to measure the expression levels of thousands of genes simultaneously. Analysis of these data requires normalization and statistical approaches that account for the biological and technical variability inherent in the technique. To address this problem, we have developed a publicly available simulator of microarray hybridization experiments that can be used to help assess the accuracy of bioinformatic tools in discovering significant genes. Estimates of various degrees of technical and biological variability were obtained by analyzing microarray hybridization experiments from over 50 samples. This information was used to develop a simulator of microarray hybridization data that modeled "normal tissue samples" and "diseased tissue samples" with known, defined changes in gene expression (a "gold standard"). The data derived from the simulator were then used to evaluate the true positive and false negative rates of several normalization procedures and gene selection techniques. We found that the type of normalization approach used was an important aspect of data analysis. Global normalization was the least accurate approach. Evaluation of gene selection techniques showed that "Significance analysis of microarrays" (SAM) and "Patterns of Gene Expression" (PaGE) were more accurate than simple t-test analysis. We provide access to the microarray hybridization simulator as a public resource for biologists to further test emerging genomic bioinformatic tools.
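
The evaluation idea described in the abstract can be illustrated with a minimal sketch. It is not the published simulator: the function names, noise levels, effect size, and group sizes below are all hypothetical choices made for illustration, and the gene selection step is a plain Welch t-statistic ranking after a simple global median centering, not SAM or PaGE. The sketch simulates "normal" and "diseased" arrays with a known set of changed genes (the gold standard) and then scores a selection method against that set.

```python
import random
import statistics


def simulate(n_genes=200, n_changed=20, n_per_group=8, shift=2.0,
             bio_sd=0.5, tech_sd=0.3, seed=0):
    """Simulate log-scale expression for two groups (all defaults are assumed).

    Returns (normal, diseased, gold): two lists of arrays, each array a list
    of n_genes values, and the set of gene indices that truly differ.
    """
    rng = random.Random(seed)
    baseline = [rng.gauss(8.0, 2.0) for _ in range(n_genes)]
    gold = set(rng.sample(range(n_genes), n_changed))

    def one_array(is_diseased):
        array_offset = rng.gauss(0.0, tech_sd)  # array-wide technical effect
        values = []
        for g in range(n_genes):
            v = baseline[g] + array_offset
            v += rng.gauss(0.0, bio_sd)   # biological variability
            v += rng.gauss(0.0, tech_sd)  # per-spot technical variability
            if is_diseased and g in gold:
                v += shift                # known, defined change
            values.append(v)
        return values

    normal = [one_array(False) for _ in range(n_per_group)]
    diseased = [one_array(True) for _ in range(n_per_group)]
    return normal, diseased, gold


def median_center(arrays):
    """Global normalization: subtract each array's own median."""
    centered = []
    for a in arrays:
        m = statistics.median(a)
        centered.append([v - m for v in a])
    return centered


def welch_t(x, y):
    """Welch t-statistic for two samples with unequal variances."""
    se = (statistics.variance(x) / len(x) + statistics.variance(y) / len(y)) ** 0.5
    return (statistics.mean(y) - statistics.mean(x)) / se


def evaluate(normal, diseased, gold, n_select):
    """Select the n_select genes with largest |t| and score them against gold."""
    scores = []
    for g in range(len(normal[0])):
        x = [a[g] for a in normal]
        y = [a[g] for a in diseased]
        scores.append((abs(welch_t(x, y)), g))
    selected = {g for _, g in sorted(scores, reverse=True)[:n_select]}
    tpr = len(selected & gold) / len(gold)  # true positive rate
    return tpr, 1.0 - tpr                   # false negative rate = 1 - TPR


normal, diseased, gold = simulate()
tpr, fnr = evaluate(median_center(normal), median_center(diseased),
                    gold, n_select=len(gold))
print(f"TPR={tpr:.2f} FNR={fnr:.2f}")
```

Because the changed genes are known in advance, swapping in a different normalization function or ranking statistic and re-running `evaluate` gives a direct comparison of methods on the same simulated data.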

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Computer Simulation*
  • DNA, Neoplasm / analysis*
  • Female
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Neoplastic / genetics*
  • Humans
  • Neoplasms / genetics*
  • Oligonucleotide Array Sequence Analysis / methods*


Substances

  • DNA, Neoplasm