Multi-platform gene-expression mining and marker gene analysis
- PMID: 22145530
- DOI: 10.1504/ijdmb.2011.043030
Abstract
Gene-expression data are now widely available and are used for a broad range of clinical and diagnostic purposes. A key challenge is to select a small number of significant marker genes for biological study. Although important genes can be found from a single gene-expression data set, it is often more meaningful to compare results across different but related data sets, especially when multiple data sets arise from different studies of a common organism or phenotype. In this paper, we present a novel framework that exploits the commonalities across data sets by learning from them jointly through multi-task feature learning. By identifying a common subspace of genes, we can help biologists find important marker genes that span different evolutionary periods in the life cycle of cancer development. The genes found in this way are more stable and more significant than those found from a single data set. Our experimental results demonstrate that more accurate models can be built from multiple data sets using fewer labelled examples. To the best of our knowledge, we are among the first to introduce multi-task learning to the bioinformatics community as a way to address the lack of labelled data.
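The abstract does not state the exact formulation, so the following is only a minimal sketch of one common form of multi-task feature learning: least-squares regression on several data sets that share the same genes, with an L2,1 penalty that shrinks each gene's weights across all tasks together. The penalty choice, the function name `fit_multitask_l21`, the synthetic data, and every parameter value are assumptions made for illustration, not the authors' method or results.

```python
import math
import random

def fit_multitask_l21(tasks, lam=0.3, step=0.1, n_iter=500):
    """Proximal gradient descent for L2,1-regularised multi-task least squares.

    tasks: list of (X, y) pairs, one per data set. X is a list of sample rows,
    and every row has the same d features (genes) in the same order.
    Returns W with W[j][t] = weight of gene j in task t.
    """
    d = len(tasks[0][0][0])
    n_tasks = len(tasks)
    W = [[0.0] * n_tasks for _ in range(d)]
    for _ in range(n_iter):
        # Gradient step on the squared-error loss, separately for each task.
        for t, (X, y) in enumerate(tasks):
            n = len(X)
            grad = [0.0] * d
            for row, target in zip(X, y):
                resid = sum(W[j][t] * row[j] for j in range(d)) - target
                for j in range(d):
                    grad[j] += resid * row[j] / n
            for j in range(d):
                W[j][t] -= step * grad[j]
        # Proximal step: group soft-thresholding of each gene's row, so a gene
        # is kept or dropped jointly across all tasks.
        for j in range(d):
            norm = math.sqrt(sum(w * w for w in W[j]))
            scale = max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
            W[j] = [scale * w for w in W[j]]
    return W

# Synthetic illustration (assumed data): 6 genes, 2 related data sets,
# and only genes 0 and 1 carry signal in both.
rng = random.Random(0)
true_weights = [
    [2.0, -1.5, 0.0, 0.0, 0.0, 0.0],
    [1.5, -2.0, 0.0, 0.0, 0.0, 0.0],
]
tasks = []
for w_true in true_weights:
    X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
    y = [sum(w * x for w, x in zip(w_true, row)) + rng.gauss(0, 0.1) for row in X]
    tasks.append((X, y))

W = fit_multitask_l21(tasks)
# The norm of each gene's row measures its importance across all tasks.
row_norms = [math.sqrt(sum(w * w for w in row)) for row in W]
ranked = sorted(range(len(row_norms)), key=lambda j: -row_norms[j])
print("genes ranked by shared importance:", ranked)
```

In this sketch the row norm of `W` plays the role of a shared marker-gene score: a gene with a large norm is useful in every data set, which is one way to read the "common subspace of genes" described above.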
Cited by 4 articles
- Prediction of deleterious mutations in coding regions of mammals with transfer learning. Evol Appl. 2018 May 9;12(1):18-28. doi: 10.1111/eva.12607. eCollection 2019 Jan. PMID: 30622632. Free PMC article.
- Comparative Evaluation of Machine Learning Strategies for Analyzing Big Data in Psychiatry. Int J Mol Sci. 2018 Oct 29;19(11):3387. doi: 10.3390/ijms19113387. PMID: 30380679. Free PMC article.
- Clinical and molecular models of glioblastoma multiforme survival. Int J Data Min Bioinform. 2013;7(3):245-65. doi: 10.1504/ijdmb.2013.053310. PMID: 23819258. Free PMC article.
- Multi-task learning for cross-platform siRNA efficacy prediction: an in-silico study. BMC Bioinformatics. 2010 Apr 10;11:181. doi: 10.1186/1471-2105-11-181. PMID: 20380733. Free PMC article.