Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach
- PMID: 16877753
- DOI: 10.1093/bioinformatics/btl412
Abstract
Motivation: Many standard statistical techniques are effective on data that are normally distributed with constant variance. Microarray data typically violate these assumptions since they come from non-Gaussian distributions with a non-trivial mean-variance relationship. Several methods have been proposed that transform microarray data to stabilize the variance and draw their distribution towards the Gaussian. Some methods, such as the log or generalized log transform, rely on an underlying model for the data. Others, such as the spread-versus-level plot, do not. We propose an alternative data-driven multiscale approach, the Data-Driven Haar-Fisz transform for microarrays with replicates (DDHFm). DDHFm has the advantage of being 'distribution-free' in the sense that no parametric model for the underlying microarray data needs to be specified or estimated; hence, DDHFm can be applied very generally, not just to microarray data.
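To make the mean-variance problem and the model-based remedy concrete, the following is a minimal simulation sketch and is not taken from the paper. It assumes a hypothetical two-component error model, y = mu * exp(eta) + eps, with invented parameter values, and applies a generalized log whose constant `c` is computed from those assumed parameters. The function names `glog` and `simulate` are illustrative only.

```python
import numpy as np

def glog(y, c):
    """Generalized log: ln(y + sqrt(y^2 + c)); close to ln(2y) for large y."""
    y = np.asarray(y, dtype=float)
    return np.log(y + np.sqrt(y * y + c))

def simulate(mu, n_rep, sigma_eps, sigma_eta, rng):
    """Assumed model (hypothetical): y = mu * exp(eta) + eps, Gaussian eta and eps."""
    eta = rng.normal(0.0, sigma_eta, size=(len(mu), n_rep))
    eps = rng.normal(0.0, sigma_eps, size=(len(mu), n_rep))
    return mu[:, None] * np.exp(eta) + eps

rng = np.random.default_rng(0)
mu = np.array([0.0, 10.0, 100.0, 1000.0, 10000.0])  # true intensities (invented)
sigma_eps, sigma_eta = 20.0, 0.2                     # assumed noise levels
y = simulate(mu, 5000, sigma_eps, sigma_eta, rng)

# Under the assumed model, Var(y) = sigma_eps^2 + mu^2 * s_eta_sq,
# and the stabilizing constant is c = sigma_eps^2 / s_eta_sq.
s_eta_sq = np.exp(sigma_eta**2) * (np.exp(sigma_eta**2) - 1.0)
c = sigma_eps**2 / s_eta_sq

raw_sd = y.std(axis=1)            # grows strongly with intensity
glog_sd = glog(y, c).std(axis=1)  # roughly constant across intensities

for m, r, g in zip(mu, raw_sd, glog_sd):
    print(f"mu={m:8.0f}  sd(raw)={r:9.2f}  sd(glog)={g:.3f}")
```

The sketch works only because the model and its parameters are known in advance, which is the dependence on an underlying model that the abstract attributes to such methods.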
Results: DDHFm achieves very good variance stabilization of microarray data with replicates and produces transformed intensities that are approximately normally distributed. Simulation studies show that it performs better than other existing methods. Application of DDHFm to real one-color cDNA data validates these results.
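The kind of variance stabilization described here can be illustrated with the classical Haar-Fisz transform, on which the method's name is based. The sketch below is an assumption-laden illustration, not the DDHFm algorithm: it fixes the mean-to-standard-deviation function `h` to the square root (appropriate for Poisson-like counts), whereas a data-driven variant would estimate that relationship from the data, and it ignores replicate structure. The function name `haar_fisz` and the simulated inputs are hypothetical.

```python
import numpy as np

def haar_fisz(v, h=np.sqrt):
    """Haar-Fisz transform with a known sd function h (default: Poisson, sqrt).

    v must be non-negative for the default h and have power-of-two length.
    """
    v = np.asarray(v, dtype=float)
    n = v.size
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two (>= 2)")

    smooth = v
    fisz = []  # Fisz coefficients, finest scale first
    while smooth.size > 1:
        even, odd = smooth[0::2], smooth[1::2]
        s = (even + odd) / 2.0   # Haar smooth coefficients (local means)
        d = (even - odd) / 2.0   # Haar detail coefficients
        scale = h(s)
        # Divide each detail by the sd implied by its local mean; use 0 where sd is 0.
        f = np.divide(d, scale, out=np.zeros_like(d), where=scale > 0)
        fisz.append(f)
        smooth = s

    # Inverse Haar transform, using the Fisz coefficients in place of the details.
    u = smooth
    for f in reversed(fisz):
        nxt = np.empty(2 * u.size)
        nxt[0::2] = u + f
        nxt[1::2] = u - f
        u = nxt
    return u

rng = np.random.default_rng(1)
low_counts = rng.poisson(5.0, size=1024)    # raw variance near 5
high_counts = rng.poisson(50.0, size=1024)  # raw variance near 50
low = haar_fisz(low_counts)
high = haar_fisz(high_counts)
print("raw variances:        ", low_counts.var(), high_counts.var())
print("transformed variances:", low.var(), high.var())
```

In this simulation both transformed variances should land close to 1 even though the raw variances differ by about a factor of ten, which is the sense in which the transform stabilizes variance.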
Availability: The R package of the Data-Driven Haar-Fisz transform (DDHFm) for microarrays is available in Bioconductor and CRAN.
