Stochastic models inspired by hybridization theory for short oligonucleotide arrays

J Comput Biol. Jul-Aug 2005;12(6):882-93. doi: 10.1089/cmb.2005.12.882.


High-density oligonucleotide expression arrays are a widely used tool for measuring gene expression on a large scale. Affymetrix GeneChip arrays appear to dominate this market. These arrays use short oligonucleotides to probe for genes in an RNA sample. Because of optical noise, nonspecific hybridization, probe-specific effects, and measurement error, ad hoc measures of expression that summarize probe intensities can lead to imprecise and inaccurate results. Various researchers have demonstrated that expression measures based on simple statistical models can provide great improvements over the ad hoc procedure offered by Affymetrix. Recently, physical models based on molecular hybridization theory have been proposed as useful tools for predicting, for example, nonspecific hybridization. These physical models show great potential for improving existing expression measures. In this paper, we suggest that the system producing the measured intensities is too complex to be fully described by these relatively simple physical models, and we propose empirically motivated stochastic models that complement the above-mentioned molecular hybridization theory to provide a comprehensive description of the data. We discuss how the proposed model can be used to obtain improved measures of expression useful to data analysts.

Publication types

  • Comparative Study

MeSH terms

  • Algorithms*
  • Data Interpretation, Statistical
  • Gene Expression Profiling / methods*
  • Models, Genetic*
  • Models, Statistical
  • Nucleic Acid Hybridization
  • Oligonucleotide Array Sequence Analysis / methods*
  • RNA, Messenger / genetics
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Stochastic Processes*

Substances

  • RNA, Messenger