Global correlation analysis between redundant probe sets using a large collection of Arabidopsis ATH1 expression profiling data

Comput Syst Bioinformatics Conf. 2006:223-6.


Oligo-based expression microarrays from Affymetrix typically contain thousands of redundant probe sets that match different regions of the same gene. We used linear regression and correlation to survey redundant probe set behavior across nearly 500 quality-screened experiments from the Arabidopsis ATH1 array manufactured by Affymetrix. We found that expression values from redundant probe set pairs were often poorly correlated. Pre-filtering expression results using MAS5.0 "present-absent" calls increased the overall percentage of well-correlated probe sets. However, poor correlation was still observed for a substantial number of probe set pairs. Visual inspection of non-correlated probe sets' target genes suggests that some may be inappropriately merged gene models and represent independently expressed, but neighboring loci. Others may reflect differential regulation of alternative 3-prime end processing. Results are on-line at
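The pairwise comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `min_arrays` threshold, and the rule that both probe sets must be called present ("P") on an array are assumptions made for illustration, and the abstract does not say how the MAS5.0 calls were applied or whether values were transformed first.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant series has no defined correlation.
        return float("nan")
    return sxy / sqrt(sxx * syy)


def filtered_pair_correlation(values_a, values_b, calls_a, calls_b, min_arrays=3):
    """Correlate two redundant probe sets across arrays.

    Assumption (hypothetical filter rule): an array is kept only when
    BOTH probe sets carry a "P" (present) detection call on it.
    Returns None when fewer than `min_arrays` arrays survive the filter.
    """
    kept = [
        (a, b)
        for a, b, ca, cb in zip(values_a, values_b, calls_a, calls_b)
        if ca == "P" and cb == "P"
    ]
    if len(kept) < min_arrays:
        return None
    xs, ys = zip(*kept)
    return pearson_r(xs, ys)


# Toy values for two probe sets on five arrays (made-up numbers, not study data).
probe_set_a = [1.0, 2.0, 3.0, 4.0, 5.0]
probe_set_b = [2.0, 4.0, 6.0, 8.0, 0.0]
calls_a = ["P", "P", "P", "P", "P"]
calls_b = ["P", "P", "P", "P", "A"]  # last array called absent for probe set B

unfiltered_r = pearson_r(probe_set_a, probe_set_b)
filtered_r = filtered_pair_correlation(probe_set_a, probe_set_b, calls_a, calls_b)
```

In this toy case the array with the absent call is the one that breaks the linear relationship, so dropping it raises the correlation. That mirrors the direction of the filtering effect reported in the abstract, though real pairs can remain poorly correlated after filtering.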

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alternative Splicing
  • Arabidopsis / genetics*
  • Arabidopsis Proteins / genetics*
  • Computational Biology / methods*
  • Databases, Genetic
  • Gene Expression Profiling*
  • Gene Expression Regulation*
  • Gene Expression Regulation, Plant*
  • Genes, Plant*
  • Genetic Techniques
  • Genomics / methods*
  • Homeodomain Proteins / genetics*
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis
  • Regression Analysis
  • Transcription Factors / genetics*

Substances

  • Arabidopsis Proteins
  • Homeodomain Proteins
  • KNAT5 protein, Arabidopsis
  • Transcription Factors