Dissecting cell identity via network inference and in silico gene perturbation

Nature. 2023 Feb;614(7949):742-751. doi: 10.1038/s41586-022-05688-9. Epub 2023 Feb 8.


Cell identity is governed by the complex regulation of gene expression, represented as gene-regulatory networks [1]. Here we use gene-regulatory networks inferred from single-cell multi-omics data to perform in silico transcription factor perturbations, simulating the consequent changes in cell identity using only unperturbed wild-type data. We apply this machine-learning-based approach, CellOracle, to well-established paradigms (mouse and human haematopoiesis, and zebrafish embryogenesis), and we correctly model reported changes in phenotype that occur as a result of transcription factor perturbation. Through systematic in silico transcription factor perturbation in the developing zebrafish, we simulate and experimentally validate a previously unreported phenotype that results from the loss of noto, an established notochord regulator. Furthermore, we identify an axial mesoderm regulator, lhx1a. Together, these results show that CellOracle can be used to analyse the regulation of cell identity by transcription factors, and can provide mechanistic insights into development and differentiation.
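
As a rough illustration of what an in silico transcription factor perturbation on a gene-regulatory network can look like, the toy sketch below propagates a knockout through a small linear network. It is not CellOracle's API or model: the function name `propagate_perturbation`, the coefficient matrix `coef`, the three-gene example and the number of propagation steps are all assumptions made for illustration only.

```python
import numpy as np


def propagate_perturbation(coef, expression, gene_index, new_value, n_steps=3):
    """Toy linear propagation of a perturbation through a regulatory network.

    coef[i, j] is an assumed linear effect of regulator i on target j.
    The perturbed gene is clamped to `new_value` at every step.
    """
    expression = np.asarray(expression, dtype=float)
    forced_shift = new_value - expression[gene_index]

    # Start with a shift in the perturbed gene only.
    delta = np.zeros_like(expression)
    delta[gene_index] = forced_shift

    for _ in range(n_steps):
        # Pass the current shift from regulators to their targets.
        delta = delta @ coef
        # Keep the perturbed gene fixed at its forced value.
        delta[gene_index] = forced_shift

    # Expression levels cannot be negative.
    return np.clip(expression + delta, 0.0, None)


# Hypothetical three-gene chain: gene 0 -> gene 1 -> gene 2.
coef = np.array([
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],
])
wild_type = np.array([2.0, 1.0, 3.0])

# Simulate a knockout of gene 0 using only the unperturbed expression values.
simulated = propagate_perturbation(coef, wild_type, gene_index=0, new_value=0.0)
print(simulated)
```

In this sketch the knockout shifts the direct target first and the downstream target on the next step, which is why more than one propagation step is used. How such expression shifts are translated into changes in cell identity is not covered here.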

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Cell Differentiation* / genetics
  • Computer Simulation*
  • Embryonic Development / genetics
  • Gene Regulatory Networks*
  • Hematopoiesis / genetics
  • Humans
  • Mesoderm / enzymology
  • Mesoderm / metabolism
  • Mice
  • Phenotype
  • Transcription Factors* / metabolism
  • Zebrafish / embryology
  • Zebrafish / genetics

Substances

  • Transcription Factors
  • noto protein, zebrafish
  • Lhx1a protein, zebrafish