The quality of gene expression microarray data has improved dramatically since the first arrays were introduced in the late 1990s. However, the reproducibility of data generated at multiple laboratory sites remains a matter of concern, especially for scientists who are attempting to combine and analyze data from public repositories. We have carried out a study in which a common set of RNA samples was assayed five times in four different laboratories using Affymetrix GeneChip arrays. We observed dramatic differences in the results across laboratories and identified batch effects in array processing as one of the primary causes for these differences. When batch processing of samples is confounded with experimental factors of interest it is not possible to separate their effects, and lists of differentially expressed genes may include many artifacts. This study demonstrates the substantial impact of sample processing on microarray analysis results and underscores the need for randomization in the laboratory as a means to avoid confounding of biological factors with procedural effects.