Discovering monotonic stemness marker genes from time-series stem cell microarray data

BMC Genomics. 2015;16 Suppl 2(Suppl 2):S2. doi: 10.1186/1471-2164-16-S2-S2. Epub 2015 Jan 21.


Background: Identification of genes with ascending or descending monotonic expression patterns over time or stages of stem cells is an important issue in time-series microarray data analysis. We propose a method named Monotonic Feature Selector (MFSelector) based on a concept of total discriminating error (DEtotal) to identify monotonic genes. MFSelector considers various time stages in stage order (i.e., Stage One vs. other stages, Stages One and Two vs. remaining stages and so on) and computes DEtotal of each gene. MFSelector can successfully identify genes with monotonic characteristics.
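
The stage-ordered splitting described above can be illustrated with a short Python sketch. It is not the MFSelector R function. The abstract does not define how the discriminating error is calculated, so the sketch assumes that the error for one split is the smallest number of samples misclassified by any single expression threshold, and that DEtotal is the sum of those errors over all stage-ordered splits. The names `split_error` and `total_discriminating_error` are hypothetical.

```python
from typing import Sequence


def split_error(values: Sequence[float], stages: Sequence[int],
                cut: int, ascending: bool = True) -> int:
    """Fewest samples misclassified by any one threshold for a single split.

    ASSUMPTION: this error definition is illustrative only; the abstract
    does not spell out how MFSelector computes its discriminating error.
    """
    early = [v for v, s in zip(values, stages) if s <= cut]
    late = [v for v, s in zip(values, stages) if s > cut]
    if not ascending:
        # For a descending gene the early stages should carry the high values.
        early, late = late, early
    best = len(values)
    # Candidate thresholds: below every value, then at each observed value.
    for t in [float("-inf")] + sorted(set(values)):
        # 'early' samples should be <= t and 'late' samples should be > t.
        errors = sum(v > t for v in early) + sum(v <= t for v in late)
        best = min(best, errors)
    return best


def total_discriminating_error(values: Sequence[float], stages: Sequence[int],
                               ascending: bool = True) -> int:
    """Sum the split errors over Stage 1 vs. rest, Stages 1-2 vs. rest, etc."""
    ordered = sorted(set(stages))
    return sum(split_error(values, stages, cut, ascending)
               for cut in ordered[:-1])


# A perfectly ascending gene has zero total error; one swapped pair costs 1.
clean = total_discriminating_error([1, 2, 3, 4, 5, 6], [1, 1, 2, 2, 3, 3])
noisy = total_discriminating_error([1, 3, 2, 4], [1, 2, 3, 4])
```

Under these assumptions, genes would be ranked by their total error, with the lowest totals marking the most monotonic candidates. Passing `ascending=False` scores the descending pattern.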

Results: We have demonstrated the effectiveness of MFSelector on two synthetic data sets and two stem cell differentiation data sets: embryonic stem cell neurogenesis (ESCN) and embryonic stem cell vasculogenesis (ESCV). We have also performed extensive quantitative comparisons of the three monotonic gene selection approaches. Some of the monotonic marker genes discovered from the ESCN data set, such as OCT4, NANOG, and BLBP, behave consistently with what has been reported in other studies. The role of the monotonic genes found by MFSelector in either stemness or differentiation is validated using Gene Ontology analysis and other literature. We show that descending genes are involved in the proliferation or self-renewal of stem cells, while ascending genes are involved in the differentiation of stem cells into various cell lineages.

Conclusions: We have developed a novel system, easy to use even with no pre-existing knowledge, to identify gene sets with monotonic expression patterns in multi-stage as well as time-series genomics matrices. The case studies on ESCN and ESCV have improved the understanding of stemness and differentiation. The novel monotonic marker genes discovered from one data set exhibit consistent behavior in another, independent data set, demonstrating the utility of the proposed method. The MFSelector R function and data sets can be downloaded from:

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cell Differentiation / genetics
  • Cell Lineage / genetics
  • Cluster Analysis
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Homeodomain Proteins / genetics
  • Humans
  • Internet
  • Nanog Homeobox Protein
  • Neovascularization, Physiologic / genetics
  • Neurogenesis / genetics
  • Octamer Transcription Factor-3 / genetics
  • Oligonucleotide Array Sequence Analysis / methods*
  • Stem Cells / cytology
  • Stem Cells / metabolism*
  • Time Factors

Substances

  • Homeodomain Proteins
  • NANOG protein, human
  • Nanog Homeobox Protein
  • Octamer Transcription Factor-3
  • POU5F1 protein, human