Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data
- PMID: 16284200
- PMCID: PMC1283542
- DOI: 10.1093/nar/gni179
Abstract
Genome-wide expression profiling is a powerful tool for implicating novel gene ensembles in cellular mechanisms of health and disease. The most popular platform for genome-wide expression profiling is the Affymetrix GeneChip. However, its selection of probes relied on earlier genome and transcriptome annotation, which differs significantly from current knowledge. The resultant informatics problems have a profound impact on the analysis and interpretation of the data. Here, we address these critical issues and offer a solution. We identified several classes of problems at the individual probe level in the existing annotation, under the assumption that current genome and transcriptome databases are more accurate than those used for GeneChip design. We then reorganized probes on more than a dozen popular GeneChips into gene-, transcript- and exon-specific probe sets in light of up-to-date genome, cDNA/EST clustering and single nucleotide polymorphism information. Comparing analysis results between the original and the redefined probe sets reveals approximately 30-50% discrepancy in the genes previously identified as differentially expressed, regardless of analysis method. Our results demonstrate that the original Affymetrix probe set definitions are inaccurate, and many conclusions derived from past GeneChip analyses may be significantly flawed. It will be beneficial to re-analyze existing GeneChip data with updated probe set definitions.
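The two steps the abstract describes, regrouping probes under updated annotation and comparing the resulting gene lists, can be illustrated with a minimal Python sketch. The abstract does not specify the authors' pipeline, so everything here is an assumption for illustration: the function names `regroup_probes` and `discrepancy`, the rule of keeping only probes that map to exactly one gene, and the toy probe and gene identifiers are all hypothetical.

```python
from collections import defaultdict


def regroup_probes(probe_to_genes):
    """Group probes into gene-specific probe sets.

    Assumption (not from the paper): a probe is kept only if the updated
    annotation maps it to exactly one gene; unmapped or ambiguous probes
    are dropped.
    """
    probe_sets = defaultdict(list)
    for probe, genes in probe_to_genes.items():
        if len(genes) == 1:
            (gene,) = genes
            probe_sets[gene].append(probe)
    return dict(probe_sets)


def discrepancy(original_hits, redefined_hits):
    """Fraction of genes called with the original probe sets that are
    no longer called with the redefined probe sets."""
    original_hits = set(original_hits)
    if not original_hits:
        return 0.0
    missing = original_hits - set(redefined_hits)
    return len(missing) / len(original_hits)


# Toy annotation: hypothetical probe and gene identifiers.
toy_annotation = {
    "p1": {"GENE_A"},
    "p2": {"GENE_A"},
    "p3": {"GENE_A", "GENE_B"},  # ambiguous: matches two genes, dropped
    "p4": set(),                 # matches no current gene, dropped
    "p5": {"GENE_B"},
}
toy_probe_sets = regroup_probes(toy_annotation)

# Toy differential-expression gene lists from the two probe set definitions.
toy_discrepancy = discrepancy(
    {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    {"GENE_A", "GENE_B", "GENE_E"},
)
```

In this toy case two of the four originally called genes are absent from the redefined list, a discrepancy of 0.5. A real analysis would presumably also need a rule for the minimum number of probes per set and an actual differential-expression test, neither of which is sketched here.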
Similar articles
- The effect of GeneChip gene definitions on the microarray study of cancers. Bioessays. 2006 Jul;28(7):739-46. doi: 10.1002/bies.20433. PMID: 16850407
- A verification protocol for the probe sequences of Affymetrix genome arrays reveals high probe accuracy for studies in mouse, human and rat. BMC Bioinformatics. 2007 Apr 20;8:132. doi: 10.1186/1471-2105-8-132. PMID: 17448222. Free PMC article.
- Transcript-level annotation of Affymetrix probesets improves the interpretation of gene expression data. BMC Bioinformatics. 2007 Jun 11;8:194. doi: 10.1186/1471-2105-8-194. PMID: 17559689. Free PMC article.
- [Transcriptome analyses and transcriptome databases]. Tanpakushitsu Kakusan Koso. 2004 Aug;49(11 Suppl):1859-65. PMID: 15377029. Review. Japanese. No abstract available.
- Normalization of microarray data: single-labeled and dual-labeled arrays. Mol Cells. 2006 Dec 31;22(3):254-61. PMID: 17202852. Review.
Cited by
- Aldh1L1 is expressed by postnatal neural stem cells in vivo. Glia. 2013 Sep;61(9):1533-41. doi: 10.1002/glia.22539. Epub 2013 Jul 8. PMID: 23836537. Free PMC article.
- Patterns of methylation heritability in a genome-wide analysis of four brain regions. Nucleic Acids Res. 2013 Feb 1;41(4):2095-104. doi: 10.1093/nar/gks1449. Epub 2013 Jan 8. PMID: 23303775. Free PMC article.
- Intermittent energy restriction induces changes in breast gene expression and systemic metabolism. Breast Cancer Res. 2016 May 28;18(1):57. doi: 10.1186/s13058-016-0714-4. PMID: 27233359. Free PMC article.
- Blood and urine multi-omics analysis of the impact of e-vaping, smoking, and cessation: from exposome to molecular responses. Sci Rep. 2024 Feb 21;14(1):4286. doi: 10.1038/s41598-024-54474-2. PMID: 38383592. Free PMC article.
- Chimeric EWSR1-FLI1 regulates the Ewing sarcoma susceptibility gene EGR2 via a GGAA microsatellite. Nat Genet. 2015 Sep;47(9):1073-8. doi: 10.1038/ng.3363. Epub 2015 Jul 27. PMID: 26214589. Free PMC article.
