Pattern identification in time-course gene expression data with the CoGAPS matrix factorization

Methods Mol Biol. 2014;1101:87-112. doi: 10.1007/978-1-62703-721-1_6.


Patterns in time-course gene expression data can represent the biological processes that are active over the measured time period. However, the orthogonality constraint in standard pattern-finding algorithms, notably principal component analysis (PCA), confounds expression changes that result from simultaneous, non-orthogonal biological processes. We have previously shown that Markov chain Monte Carlo nonnegative matrix factorization algorithms are particularly adept at distinguishing such concurrent patterns. One such matrix factorization is implemented in the software package CoGAPS. We describe the application of this software, along with several technical considerations, to the identification of age-related patterns in a public prefrontal cortex gene expression dataset.
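
To make the contrast with orthogonal methods concrete, the following is a minimal sketch of nonnegative matrix factorization on synthetic time-course data. It is not the CoGAPS algorithm: CoGAPS uses Markov chain Monte Carlo sampling, whereas this sketch swaps in the simpler Lee-Seung multiplicative updates as a stand-in. The two time patterns, the gene weights, the matrix names D, A, and P, and all function names are assumptions made purely for illustration. The point is only that two overlapping (non-orthogonal) patterns can be fit by nonnegative factors without imposing orthogonality.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frob_error(D, A, P):
    """Squared Frobenius norm of D - A P."""
    R = matmul(A, P)
    return sum((d - r) ** 2 for d_row, r_row in zip(D, R)
               for d, r in zip(d_row, r_row))

def nmf(D, k, iters=500, seed=0, eps=1e-9):
    """Fit D (genes x times) ~ A (genes x k) P (k x times) with A, P >= 0.

    Uses Lee-Seung multiplicative updates, NOT the MCMC sampler of CoGAPS.
    """
    rng = random.Random(seed)
    n, m = len(D), len(D[0])
    # Strictly positive random start so the updates are well defined.
    A = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    P = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    errors = []
    for _ in range(iters):
        # P <- P * (A^T D) / (A^T A P), elementwise
        At = transpose(A)
        num = matmul(At, D)
        den = matmul(matmul(At, A), P)
        P = [[P[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # A <- A * (D P^T) / (A P P^T), elementwise
        Pt = transpose(P)
        num = matmul(D, Pt)
        den = matmul(A, matmul(P, Pt))
        A = [[A[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
        errors.append(frob_error(D, A, P))
    return A, P, errors

# Hypothetical time patterns over 8 time points: a steady rise and a
# transient peak. They are active at the same time, so they overlap.
rising = [t / 7 for t in range(8)]
peak = [0.1, 0.4, 0.9, 1.0, 0.7, 0.3, 0.1, 0.0]
overlap = sum(a * b for a, b in zip(rising, peak))  # nonzero: not orthogonal

# Hypothetical gene weights: some genes follow one pattern, some follow both.
true_A = [[1.0, 0.0], [0.8, 0.0], [0.0, 1.0], [0.0, 0.6], [0.5, 0.5], [0.3, 0.9]]
D = matmul(true_A, [rising, peak])

A, P, errors = nmf(D, k=2)
print("pattern overlap (dot product):", overlap)
print("reconstruction error, first vs last iteration:", errors[0], errors[-1])
```

Because the true patterns have a nonzero dot product, no pair of orthogonal components can coincide with both of them, which is the limitation of PCA noted above. A nonnegative factorization carries no such constraint, although this simple solver returns a single point estimate and depends on its random start.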

MeSH terms

  • Gene Expression Profiling*
  • Humans
  • Markov Chains
  • Molecular Sequence Annotation
  • Monte Carlo Method
  • Pattern Recognition, Automated
  • Principal Component Analysis
  • Software*
  • Transcriptome