View-based models of 3D object recognition: invariance to imaging transformations

Cereb Cortex. May-Jun 1995;5(3):261-9. doi: 10.1093/cercor/5.3.261.


This report describes the main features of a view-based model of object recognition. The model does not attempt to account for specific cortical structures; it tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network (RBF-like; see Poggio and Girosi, 1989; Poggio, 1990) in which each of the hidden units is broadly tuned to a specific view of the object to be recognized. The network output, which may be largely view independent, is first described in terms of some simple simulations. The following refinements and details of the basic module are then discussed: (1) some of the units may represent only components of views of the object: the optimal stimulus for the unit, its "center," is effectively a complex feature; (2) the units' properties are consistent with the usual description of cortical neurons as tuned to multidimensional optimal stimuli and may be realized in terms of plausible biophysical mechanisms; (3) in learning to recognize new objects, preexisting centers may be used and modified, but also new centers may be created incrementally so as to provide maximal view invariance; (4) modules are part of a hierarchical structure: the output of a network may be used as one of the inputs to another, in this way synthesizing increasingly complex features and templates; (5) in several recognition tasks, in particular at the basic level, a single center using view-invariant features may be sufficient.
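As an illustration of the basic module, the following is a minimal sketch of an RBF-like network whose hidden units are each tuned to one stored view. It is not the authors' implementation. Everything specific in it is an assumption made for the example: objects are random 3D point sets, a "view" is the flattened orthographic projection after rotation about the vertical axis, the hidden units are Gaussian with a shared width `SIGMA`, and the output weights are fit by least squares so that the module responds with 1 to each stored view. The function names (`project_view`, `hidden_activations`, `train_module`, `module_output`) are hypothetical.

```python
import numpy as np

def project_view(points_3d, angle_rad):
    """Rotate points about the vertical (y) axis; return flattened orthographic (x, y) coordinates."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot_y = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    rotated = points_3d @ rot_y.T
    return rotated[:, :2].ravel()

def hidden_activations(views, centers, sigma):
    """Gaussian response of every view-tuned unit (one per center) to every input view."""
    sq_dist = ((views[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def train_module(centers, sigma):
    """Fit output weights so that the module outputs 1 for each stored view."""
    gram = hidden_activations(centers, centers, sigma)
    targets = np.ones(len(centers))
    weights, *_ = np.linalg.lstsq(gram, targets, rcond=None)
    return weights

def module_output(views, centers, weights, sigma):
    """Weighted sum of the view-tuned units: the module's recognition signal."""
    return hidden_activations(views, centers, sigma) @ weights

# Illustrative setup (all values are assumptions chosen for the example).
SIGMA = 1.0
rng = np.random.default_rng(0)
target_object = rng.standard_normal((6, 3))      # object to be learned
distractor_object = rng.standard_normal((6, 3))  # a different object

# Store 12 training views, 30 degrees apart, as the unit centers.
train_angles = np.deg2rad(np.arange(0, 360, 30))
train_views = np.stack([project_view(target_object, a) for a in train_angles])
weights = train_module(train_views, SIGMA)

# Probe with views halfway between the stored ones.
novel_angles = np.deg2rad(np.arange(15, 360, 30))
target_novel = np.stack([project_view(target_object, a) for a in novel_angles])
distractor_views = np.stack([project_view(distractor_object, a) for a in novel_angles])

target_out = module_output(target_novel, train_views, weights, SIGMA)
distractor_out = module_output(distractor_views, train_views, weights, SIGMA)

print("mean output, novel views of target:", target_out.mean())
print("mean output, views of distractor:  ", distractor_out.mean())
```

In this toy setting each hidden unit responds only to views near its own center, yet the summed output stays high across rotations of the learned object and low for the other object. That is one simple reading of a network output that "may be largely view independent". The number of stored views and the unit width trade off against each other here; the abstract does not specify either.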

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Depth Perception*
  • Form Perception*
  • Humans
  • Mathematics
  • Memory
  • Models, Neurological*
  • Psychophysics