Inferring TF activation order in time series scRNA-Seq studies

PLoS Comput Biol. 2020 Feb 18;16(2):e1007644. doi: 10.1371/journal.pcbi.1007644. eCollection 2020 Feb.


Methods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments and provide information on how and when the process is regulated. We developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method, which integrates probabilistic modeling of scRNA-Seq data with the ability to assign TFs to specific activation points in the model. TFs are assumed to influence the emission probabilities for cells assigned to later time points, allowing us to identify not just the TFs controlling each path but also their order of activation. We tested CSHMM-TF on several mouse and human datasets. As we show, the method was able to identify known and novel TFs for all processes, the assigned times of activation agree with both expression information and prior knowledge, and the combinatorial predictions are supported by known interactions. We also show that CSHMM-TF improves upon prior methods that do not utilize TF-gene interactions.
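
The idea that a TF influences emission probabilities only for cells assigned after its activation point can be illustrated with a toy sketch. This is not the CSHMM-TF model itself: the linear mean, the Gaussian emission, the grid search, and all function and parameter names below are assumptions made purely for illustration.

```python
import math


def expected_expression(t, mu_start, mu_end, t_act=0.0):
    """Mean expression of one TF target gene at path position t in [0, 1].

    Toy assumption: the gene stays at mu_start until its regulating TF
    becomes active at t_act (< 1), then moves linearly towards mu_end.
    """
    if t <= t_act:
        return mu_start
    frac = (t - t_act) / (1.0 - t_act)
    return mu_start + frac * (mu_end - mu_start)


def gaussian_log_pdf(x, mean, sigma):
    """Log density of a normal distribution (assumed emission model)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def path_log_likelihood(cells, mu_start, mu_end, t_act, sigma=1.0):
    """Sum of emission log-probabilities over (position, expression) pairs."""
    return sum(
        gaussian_log_pdf(x, expected_expression(t, mu_start, mu_end, t_act), sigma)
        for t, x in cells
    )


def best_activation_time(cells, mu_start, mu_end, grid):
    """Pick the candidate activation point that best explains the cells."""
    return max(grid, key=lambda t_act: path_log_likelihood(cells, mu_start, mu_end, t_act))


# Noise-free synthetic cells along one path, generated with activation at 0.5.
cells = [(i / 10, expected_expression(i / 10, 0.0, 4.0, t_act=0.5)) for i in range(11)]
grid = [0.0, 0.25, 0.5, 0.75]
estimate = best_activation_time(cells, 0.0, 4.0, grid)
```

Ranking several TFs by their estimated activation points along a path would then give an activation order, which is the kind of output the abstract describes.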

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Animals
  • Chromatin Immunoprecipitation
  • Computational Biology
  • Databases, Factual
  • Gene Expression Profiling
  • Humans
  • Liver / metabolism
  • Lung / metabolism
  • Markov Chains
  • Mice
  • Models, Statistical
  • Probability
  • RNA, Small Cytoplasmic / metabolism*
  • RNA-Seq*
  • Single-Cell Analysis*
  • Transcription Factors / metabolism*
  • Transcription, Genetic


Substances

  • RNA, Small Cytoplasmic
  • Transcription Factors