Modeling the latent dimensions of multivariate signaling datasets

Phys Biol. 2012 Aug;9(4):045004. doi: 10.1088/1478-3975/9/4/045004. Epub 2012 Aug 7.


Cellular signal transduction is coordinated by modifications of many proteins within cells. Protein modifications are not independent, because some are connected through shared signaling cascades and others jointly converge upon common cellular functions. This coupling creates a hidden structure within a signaling network that can point to higher level organizing principles of interest to systems biology. One can identify important covariations within large-scale datasets by using mathematical models that extract latent dimensions, the key structural elements of a measurement set. In this paper, we introduce two principal component-based methods for identifying and interpreting latent dimensions. Principal component analysis provides a starting point for unbiased inspection of the major sources of variation within a dataset. Partial least-squares regression reorients these dimensions toward a specific hypothesis of interest. Both approaches have been used widely in studies of cell signaling, and they should be standard analytical tools once highly multivariate datasets become straightforward to accumulate.
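The contrast between the two methods can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the synthetic data, the function names `pca` and `pls1`, and the NIPALS-style single-response PLS routine are all assumptions chosen for illustration. Eight hypothetical protein measurements are driven by two uncorrelated latent factors, and the response `y` depends only on the weaker one. Under these assumptions, the first principal component should follow the dominant source of variation, while the first PLS component should be reoriented toward the response.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the mean-centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample projections
    loadings = Vt[:n_components].T                    # variable weights
    explained = (s**2 / np.sum(s**2))[:n_components]  # variance fractions
    return scores, loadings, explained

def pls1(X, y, n_components):
    """NIPALS-style PLS for one response (illustrative, not the paper's code)."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    W, T, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                 # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                    # latent scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings used for deflation
        c = yr @ t / tt               # regression of y on the scores
        Xr = Xr - np.outer(t, p)      # remove the explained part of X
        yr = yr - c * t               # remove the explained part of y
        W.append(w); T.append(t); q.append(c)
    return np.array(T).T, np.array(W).T, np.array(q)

# Synthetic data (assumed): factor a dominates the variance, y tracks factor b.
rng = np.random.default_rng(0)
n = 60
a = rng.normal(size=n); a -= a.mean()
b = rng.normal(size=n); b -= b.mean()
b -= a * (a @ b) / (a @ a)            # make b exactly uncorrelated with a
a *= 3.0                              # factor a has the larger variance
load_a = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
load_b = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.outer(a, load_a) + np.outer(b, load_b) + 0.1 * rng.normal(size=(n, 8))
y = b.copy()

pc_scores, pc_loadings, pc_explained = pca(X, 2)
pls_scores, pls_weights, pls_q = pls1(X, y, 2)

corr_pc1 = np.corrcoef(pc_scores[:, 0], y)[0, 1]
corr_pls1 = np.corrcoef(pls_scores[:, 0], y)[0, 1]
print("fraction of variance in PC1:", pc_explained[0])
print("correlation of PC1 scores with y:", corr_pc1)
print("correlation of PLS component 1 scores with y:", corr_pls1)
```

The latent factors are made exactly uncorrelated and their loadings non-overlapping so that the difference between the two projections is easy to see. Real signaling datasets would rarely separate this cleanly, and variables are often scaled to unit variance before either analysis.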

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computer Simulation*
  • Humans
  • Least-Squares Analysis
  • Models, Biological*
  • Principal Component Analysis
  • Proteins / metabolism
  • Signal Transduction*
  • Systems Biology


Substances

  • Proteins