The statistical distribution of the intensity of pixels within spots of DNA microarrays: what is the appropriate single-value representative?

Appl Bioinformatics. 2003;2(4):229-39.

Abstract

This paper opens a discussion about an important issue in the analysis of data from spotted DNA microarrays: how to summarise the distribution of pixel intensities within a spot into a single value. Although the median is the most widely used statistic, no study has clearly demonstrated why it is more appropriate than other measures of central tendency such as the mean or the mode. Here, we argue that the median intensity is not the most appropriate measure in many common cases, and we discuss the frequently encountered case of a 'doughnut'-shaped spot, for which the mode is closest to the 'expected' spot intensity. For an 'ideal' spot that has a clear boundary and is uniformly hybridised, the pixel intensities should be approximately normally distributed. In practice, these two requirements are often not met because of the physical properties of the pins and the particularities of the printing and hybridisation processes. As a consequence, the distribution of pixel intensities is usually negatively skewed. This asymmetry displaces the mean and the median further from their values in the ideal case than it displaces the mode.
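The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or data. It assumes a hypothetical 'doughnut' spot in which most pixels sit at an expected intensity with Gaussian noise, while a fraction of under-hybridised pixels falls anywhere between 30% and 100% of that intensity, giving a negatively skewed distribution. The names `simulate_doughnut_spot`, `histogram_mode` and `summarise`, and all parameter values, are illustrative assumptions. The mode is estimated from a histogram, which is only one of several possible estimators.

```python
import math
import random
import statistics


def simulate_doughnut_spot(n_pixels=2000, expected=1000.0, noise_sd=50.0,
                           hole_fraction=0.3, seed=0):
    """Simulate pixel intensities for a hypothetical 'doughnut' spot.

    Assumption (not from the paper): a fraction of the pixels is
    under-hybridised and drawn uniformly between 30% and 100% of the
    expected intensity; all pixels carry Gaussian noise.
    """
    rng = random.Random(seed)
    pixels = []
    for _ in range(n_pixels):
        if rng.random() < hole_fraction:
            base = rng.uniform(0.3 * expected, expected)  # low-intensity tail
        else:
            base = expected  # properly hybridised pixel
        pixels.append(base + rng.gauss(0.0, noise_sd))
    return pixels


def histogram_mode(values, bin_width):
    """Estimate the mode as the centre of the most populated bin.

    Bins are centred on integer multiples of bin_width; the estimate
    depends on this choice of bin width and placement.
    """
    counts = {}
    for v in values:
        b = math.floor(v / bin_width + 0.5)
        counts[b] = counts.get(b, 0) + 1
    best_bin = max(counts, key=counts.get)
    return best_bin * bin_width


def summarise(pixels, bin_width=50.0):
    """Return the three candidate single-value representatives."""
    return {
        "mean": statistics.fmean(pixels),
        "median": statistics.median(pixels),
        "mode": histogram_mode(pixels, bin_width),
    }


EXPECTED = 1000.0
summary = summarise(simulate_doughnut_spot(expected=EXPECTED))
# Absolute displacement of each statistic from the expected intensity.
errors = {name: abs(value - EXPECTED) for name, value in summary.items()}

for name in ("mean", "median", "mode"):
    print(f"{name:>6}: {summary[name]:8.1f}  (displacement {errors[name]:6.1f})")
```

Under these assumptions the low-intensity tail pulls the mean furthest from the expected intensity and the median somewhat less, while the histogram mode stays at the main peak. Real spots may behave differently, and the result is sensitive to the chosen bin width.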

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms
  • Artifacts
  • Data Interpretation, Statistical*
  • Gene Expression Profiling / methods*
  • Image Interpretation, Computer-Assisted / methods*
  • Image Interpretation, Computer-Assisted / standards
  • Microscopy, Fluorescence / methods*
  • Microscopy, Fluorescence / standards
  • Models, Genetic*
  • Models, Statistical*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Oligonucleotide Array Sequence Analysis / standards
  • Quality Control
  • Reproduction
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards
  • Signal Processing, Computer-Assisted