Integrative framework for identification of key cell identity genes uncovers determinants of ES cell identity and homeostasis

Proc Natl Acad Sci U S A. 2014 Apr 22;111(16):E1581-90. doi: 10.1073/pnas.1318598111. Epub 2014 Apr 7.


Identification of genes associated with specific biological phenotypes is a fundamental step toward understanding the molecular basis of development and pathogenesis. Although RNAi-based high-throughput screens are routinely used for this task, false discoveries and limited sensitivity remain a challenge. Here we describe a computational framework for systematic integration of published gene expression data to identify genes defining a phenotype of interest. We applied our approach to rank-order all genes based on their likelihood of determining ES cell (ESC) identity. RNAi-mediated loss-of-function experiments on top-ranked genes uncovered many novel determinants of ESC identity, validating the derived gene ranks as a resource for those working to uncover novel ESC regulators. Underscoring the value of our gene ranks, functional studies of our top hit, Nucleolin (Ncl), which is abundant in stem and cancer cells, revealed that Ncl is essential for maintaining ESC homeostasis: it shields cells against differentiation-inducing oxidative stress caused by redox imbalance. Notably, we report a conceptually novel mechanism involving a Nucleolin-dependent Nanog-p53 bistable switch that regulates the homeostatic balance between self-renewal and differentiation in ESCs. Our findings reveal a previously unknown regulatory circuitry involving genes associated with traits in both ESCs and cancer and might have profound implications for understanding cell fate decisions in cancer stem cells. By helping to prioritize and preselect candidate genes for testing in complex and expensive genetic screens, the proposed computational framework provides a powerful yet inexpensive means of identifying key cell identity genes.

Keywords: RNA-binding protein; ROS; computational biology; pluripotency; transcription.

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cell Differentiation / genetics
  • Cell Proliferation
  • Embryonic Stem Cells / cytology*
  • Embryonic Stem Cells / metabolism*
  • Gene Expression Regulation
  • Homeodomain Proteins / metabolism
  • Homeostasis / genetics*
  • Mice
  • Nanog Homeobox Protein
  • Oxidative Stress / genetics
  • Phosphoproteins / genetics
  • Phosphoproteins / metabolism
  • Pluripotent Stem Cells / cytology
  • Pluripotent Stem Cells / metabolism
  • RNA Interference
  • RNA-Binding Proteins / genetics
  • RNA-Binding Proteins / metabolism
  • Reactive Oxygen Species / metabolism
  • Reproducibility of Results
  • Transcription, Genetic
  • Tumor Suppressor Protein p53 / metabolism

Substances

  • Homeodomain Proteins
  • Nanog Homeobox Protein
  • Nanog protein, mouse
  • Phosphoproteins
  • RNA-Binding Proteins
  • Reactive Oxygen Species
  • Tumor Suppressor Protein p53
  • nucleolin

Associated data

  • GEO/GSE47872