High-dimensional inference with the generalized Hopfield model: principal component analysis and corrections

Phys Rev E Stat Nonlin Soft Matter Phys. 2011 May;83(5 Pt 1):051123. doi: 10.1103/PhysRevE.83.051123. Epub 2011 May 20.

Abstract

We consider the problem of inferring the interactions between a set of N binary variables from the knowledge of their frequencies and pairwise correlations. The inference framework is based on the Hopfield model, a special case of the Ising model where the interaction matrix is defined through a set of patterns in the variable space and has rank much smaller than N. We show that maximum likelihood inference is deeply related to principal component analysis when the amplitude of the pattern components ξ is negligible compared to √N. Using techniques from statistical mechanics, we calculate the corrections to the patterns to first order in ξ/√N. We stress the need to generalize the Hopfield model and include both attractive and repulsive patterns in order to correctly infer networks with sparse and strong interactions. We present a simple geometrical criterion to decide how many attractive and repulsive patterns should be considered as a function of the sampling noise. We also discuss how many sampled configurations are required for accurate inference, as a function of the system size N and of the amplitude ξ. The inference approach is illustrated on synthetic and biological data.
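To make the PCA connection concrete, the sketch below shows a minimal, zeroth-order version of the procedure the abstract describes: estimate the Pearson correlation matrix from sampled configurations, diagonalize it, and treat eigenvectors with unusually large eigenvalues as attractive patterns and those with unusually small eigenvalues as repulsive patterns. The function name `infer_hopfield_patterns` and the Marchenko-Pastur-based selection rule are illustrative placeholders, not the paper's geometrical criterion, and the first-order corrections in ξ/√N derived in the paper are omitted.

```python
# Minimal sketch (not the authors' exact algorithm) of Hopfield-pattern
# inference via principal component analysis of pairwise correlations.
import numpy as np

def infer_hopfield_patterns(samples):
    """samples: (B, N) array of +/-1 spins, B sampled configurations."""
    B, N = samples.shape
    means = samples.mean(axis=0)

    # Connected correlations, normalized to a Pearson correlation matrix.
    cov = samples.T @ samples / B - np.outer(means, means)
    std = np.sqrt(np.clip(np.diag(cov), 1e-12, None))
    corr = cov / np.outer(std, std)

    evals, evecs = np.linalg.eigh(corr)  # eigenvalues in ascending order

    # Placeholder selection rule (assumption): eigenvalues outside the
    # Marchenko-Pastur bulk [(1-sqrt(r))^2, (1+sqrt(r))^2], r = N/B,
    # are treated as signal; the paper's own criterion differs.
    r = N / B
    lo, hi = (1 - np.sqrt(r)) ** 2, (1 + np.sqrt(r)) ** 2
    attractive = evecs[:, evals > hi]  # large eigenvalues -> attractive patterns
    repulsive = evecs[:, evals < lo]   # small eigenvalues -> repulsive patterns

    # Zeroth-order (PCA-level) coupling estimate: attractive patterns enter
    # with a plus sign, repulsive patterns with a minus sign.
    J = attractive @ attractive.T - repulsive @ repulsive.T
    np.fill_diagonal(J, 0.0)
    return J, attractive, repulsive
```

As the abstract notes, the number of retained attractive and repulsive patterns should depend on the sampling noise, i.e., on how many configurations B were sampled relative to the system size N; the bulk-edge threshold above is only one simple way to encode that dependence.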

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Likelihood Functions
  • Prefrontal Cortex / cytology
  • Principal Component Analysis*
  • Protein Structure, Tertiary
  • Proteins / chemistry
  • Proteins / metabolism
  • Rats

Substances

  • Proteins