Comparative Study. Sci Rep. 2016 Apr 26;6:25025. doi: 10.1038/srep25025.

A specialized face-processing model inspired by the organization of monkey face patches explains several face-specific phenomena observed in humans

Amirhossein Farzmahdi et al.

Abstract

Converging reports indicate that face images are processed through specialized neural networks in the brain, i.e. face patches in monkeys and the fusiform face area (FFA) in humans. These studies were designed to find out how faces are processed in the visual system compared to other objects; yet the underlying mechanism of face processing is not fully understood. Here, we show that a hierarchical computational model, inspired by electrophysiological evidence on face processing in primates, is able to generate representational properties similar to those observed in monkey face patches (posterior, middle, and anterior patches). Since a central goal of sensory neuroscience is to link neural responses with behavioral outputs, we also test whether the proposed model, which is designed to account for neural responses in monkey face patches, can predict well-documented behavioral face phenomena observed in humans. We show that the proposed model reproduces several cognitive face effects, such as the composite face effect and canonical face views. Our model provides insights into the underlying computations that transfer visual information from the posterior to the anterior face patches.


Figures

Figure 1. Schematic of the proposed model.
(A) Each block shows a layer of the model together with its properties. The S1 and C1 layers represent bars and edges, similar to V1/V2 in the visual system. Face parts are represented in the S2 and C2 layers. Subsequently, face views are coded in the VSL, and face identities are coded within the pattern of activity of ISL units (e.g. red circles for identity 1 and blue circles for identity 2; different shades of red/blue indicate the level of activity). (B) Number of selected subjects in the ISL during learning. The horizontal axis shows the number of ISL units (No. Subjects) and the vertical axis the number of trials; the green curve shows the average number of selected units across 10 random trials. (C) Saturation of VISI and identification performance during learning. The horizontal axis shows the number of selected ISL units (No. Subjects) and the vertical axis shows performance and VISI. The pale curves show 10 random runs and the thick blue and red curves show the averages.
Figure 2. Representational geometries of face views and identities in ISL, VSL, and C2.
Top row (A–C): similarity matrices computed from activity in ISL, VSL, and C2, from left to right. For each layer, a 100 × 100 similarity matrix was constructed by calculating the pairwise Pearson correlation between the feature vectors extracted for 10 sample subjects at 10 face viewpoints (in steps of 20° from −90° to 90°). Bottom row (D–F): each panel shows the result of multidimensional scaling (MDS) of the responses to the face images in a given layer (D: ISL, E: VSL, F: C2). Each plot shows the locations of 10 subjects (numbered 1 to 10) at 10 face views (indicated by 10 different colors, shown in the right inset) along the first two dimensions of the MDS space. Note that clusters of face views and face identities form in VSL and ISL, respectively. (G) The method for calculating the view selectivity index (VSI) and the view-invariant identity selectivity index (VISI): pale blue values are divided by dark blue values, with the diagonal omitted from the calculation. (H) VISI is significantly higher in ISL than in VSL and C2 (ranksum test, p = 0.001), whereas face views are better decoded in VSL than in ISL and C2.
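The similarity-matrix and selectivity-index analysis described for Fig. 2 (panels A–C, G) can be sketched as follows. The feature data, layer size, and the exact form of the index are illustrative assumptions, not the authors' code; the index here is one plausible reading of panel G (mean within-group similarity divided by mean between-group similarity, diagonal excluded).

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_views, n_features = 10, 10, 256

# Hypothetical layer activations: one feature vector per (subject, view) pair,
# with a shared per-identity component so within-identity similarity is high.
identity = np.repeat(np.arange(n_subjects), n_views)   # subject label per row
view = np.tile(np.arange(n_views), n_subjects)         # view label per row
base = rng.standard_normal((n_subjects, n_features))
features = base[identity] + 0.5 * rng.standard_normal((n_subjects * n_views, n_features))

# 100 x 100 matrix of pairwise Pearson correlations (cf. Fig. 2A-C).
sim = np.corrcoef(features)

def selectivity_index(sim, labels):
    """Assumed form of Fig. 2G: mean within-group similarity divided by
    mean between-group similarity, with the diagonal excluded."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    return sim[same & off_diag].mean() / sim[~same].mean()

visi = selectivity_index(sim, identity)  # grouping by identity -> VISI
vsi = selectivity_index(sim, view)       # grouping by view -> VSI
```

Grouping the same matrix by identity or by view yields VISI and VSI respectively, which is why the two indices can be compared layer by layer as in panel H.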
Figure 3. Higher degree of invariance (DoI) in ISL than in C2.
(A) View tolerance at the level of C2 units. Each tuning curve shows the degree of invariance of C2 responses for a particular viewing angle (face view); only a subset of tuning curves is presented (curves for every view are shown in Supplementary Fig. S1). The vertical axis is the correlation between the feature vectors of a set of subjects at one reference view and the feature vectors of the same subjects at all other views. The horizontal axis indicates the different views, in steps of 5°. The colored horizontal lines underneath each curve mark the significant range of the DoI (p < 0.02, ranksum test) for that view. Each row of the invariance matrix below the tuning curves corresponds to the tuning curve for one face viewpoint (viewing angles are separated by 5°, from −90° in the first row to +90° in the last row; head poses and camera positions are shown schematically along the horizontal axis). The color bar in the right inset shows the range of correlations. The gray horizontal lines printed on the invariance matrix show the degree of invariance for every view, as in the tuning curves (ranksum test). (B) View tolerance at the level of ISL units. (C) Summary of view-tolerance responses for each face view in C2 units (red bars) and ISL units (blue bars); the horizontal axis shows the different face views. (D) Average DoI across all views for ISL and C2, calculated from the data shown in (C).
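The per-view tuning curves in Fig. 3A,B can be sketched as correlations between a reference view's response pattern and every other view's pattern. Array shapes, names, and the random data are illustrative assumptions; real input would be the model's C2 or ISL feature vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_features = 20, 128
views = np.arange(-90, 95, 5)  # viewing angles in 5-degree steps, -90 to +90

# Hypothetical responses[view_index, subject, feature].
responses = rng.standard_normal((len(views), n_subjects, n_features))

def view_tuning(responses, ref_idx):
    """Correlation of each view's response pattern with the reference view's
    pattern, computed per subject and then averaged over subjects."""
    ref = responses[ref_idx]
    curve = []
    for v in range(responses.shape[0]):
        r = [np.corrcoef(ref[s], responses[v, s])[0, 1] for s in range(ref.shape[0])]
        curve.append(np.mean(r))
    return np.array(curve)

# One tuning curve, using the frontal view (0 degrees) as the reference.
curve = view_tuning(responses, ref_idx=len(views) // 2)
```

Repeating this for every reference view, row by row, would produce the full invariance matrix shown below the tuning curves.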
Figure 4. Performance of the model (ISL) in view-invariant face recognition.
(A) Identification performance across views. The color-coded matrix shows the performance of the model in identifying subjects across views: each row shows the performance of a classifier trained on one particular view and tested on all views (e.g. the first row shows a classifier trained on −90° and tested on all other views). The color bar at the top left shows the range of identification performance; the vertical axis shows the training views and the horizontal axis the test views. The chance level is 5%. A subset of performance curves is shown in the right inset, demonstrating how performance varies across views; the peaks of the curves shift as the training view changes (performance for every view is shown in Supplementary Fig. S2). The small black vertical axes at the right of the curves indicate 20% performance. Error bars are standard deviations over 10 runs. (B) Performance comparison across views. Each circle shows the average recognition rate for one training view (i.e. the mean performance across all test views); the vertical axis shows the mean performance and the horizontal axis the views. Performance curves are shown for several sample views. Error bars are standard deviations and performances are averaged over 10 runs.
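The train-on-one-view, test-on-all-views matrix of Fig. 4A can be sketched as follows. A 1-nearest-neighbour identifier stands in for the paper's classifier, and the feature data are random placeholders; both are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n_views, n_subjects, n_features = 10, 20, 64

# Hypothetical ISL features[view, subject, feature].
feats = rng.standard_normal((n_views, n_subjects, n_features))

def cross_view_accuracy(feats):
    """perf[i, j]: identify subjects at test view j using exemplars from
    training view i (nearest training exemplar decides the identity)."""
    n_views, n_subjects, _ = feats.shape
    perf = np.zeros((n_views, n_views))
    for i in range(n_views):
        train = feats[i]
        for j in range(n_views):
            test = feats[j]
            # squared Euclidean distance from every test to every train vector
            d = ((test[:, None, :] - train[None, :, :]) ** 2).sum(-1)
            pred = d.argmin(axis=1)
            perf[i, j] = (pred == np.arange(n_subjects)).mean()
    return perf

perf = cross_view_accuracy(feats)  # rows: training view, columns: test view
```

Row means of this matrix correspond to the per-view averages plotted in panel B, which is how a "canonical" training view can emerge.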
Figure 5. Face inversion effect (FIE) for different views.
(A) The normalized Euclidean distance between the feature vectors of inverted and upright face images for C2 units (top) and ISL (bottom). The inversion effect is far more pronounced in ISL than in the C2 layer. The vertical axis shows the normalized distance and the horizontal axis the different views, in steps of 5°. Cyan bars show the results for upright face images and purple bars the results for inverted face images. (B) Similarity matrices and MDS plots in ISL for upright (left) and inverted (right) faces. The similarity matrices show the pairwise similarities between the internal representations of the model for two different face views; the lines along the horizontal and vertical axes indicate the different face views. The diagonal parallel lines in the similarity matrix for upright faces (left) indicate identity selectivity in ISL for upright faces. The left MDS plot shows the results for upright faces and the right one for inverted faces; color-coded circles in the MDS space represent 10 subjects at eight different views. (C) VISI for upright and inverted faces in the model (ranksum test; see Materials and Methods). Error bars are standard deviations (STD) over 10 independent runs. (D) Discriminability score (the z-scored mean pairwise Euclidean distance between identities at the same in-plane rotation) computed from the feature vectors of C2 units (left) and ISL (right). The vertical axis shows the discriminability score and the horizontal axis the in-plane rotations, in steps of 5°. Sample in-plane rotations of a schematic face are shown above the C2 responses. Stars mark significant discriminability scores (one-sided ranksum test, FDR corrected at 0.05).
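The discriminability score of Fig. 5D can be sketched as the mean pairwise Euclidean distance between identities at each in-plane rotation, z-scored across rotations. The rotation count, feature size, and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_rotations, n_identities, n_features = 37, 10, 64  # e.g. 5-degree rotation steps

# Hypothetical feature vectors feats[rotation, identity, feature].
feats = rng.standard_normal((n_rotations, n_identities, n_features))

def mean_pairwise_distance(x):
    """Mean Euclidean distance over all identity pairs at one rotation."""
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices(len(x), k=1)  # each unordered pair counted once
    return d[iu].mean()

scores = np.array([mean_pairwise_distance(feats[r]) for r in range(n_rotations)])
z_scores = (scores - scores.mean()) / scores.std()  # z-scored across rotations
```

Because the scores are z-scored within a layer, the resulting curves for C2 and ISL can be compared on a common scale, as in the figure.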
Figure 6. Model responses in the aligned vs. misaligned face identification task (composite face effect).
(A) The hit rate for identification of aligned (purple) and misaligned (red) faces in the C2 layer. The vertical axis shows the hit rate and the horizontal axis the threshold range (see Materials and Methods). Several samples of aligned (purple frames) and misaligned (red frames) face images are shown above the plot. Two sample bar plots are shown in the right inset for two different thresholds: 0.5 (gray background) and ~1 (green background). The blue region marks the range in which the hit rates for aligned and misaligned faces differ significantly (ranksum test). (B) The hit rate for identification of aligned and misaligned faces in ISL. In both (A) and (B), each point corresponds to the hit rate at the threshold value shown on the x-axis; the threshold specifies the boundary at which the model considers two face images to be the same identity (0 < thr < 1).
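The threshold sweep in Fig. 6A,B can be sketched as follows: for each threshold in (0, 1), a pair counts as a hit when its similarity exceeds the threshold. The similarity distributions below are invented placeholders (in the actual analysis they would come from the model's feature vectors for aligned and misaligned composites).

```python
import numpy as np

rng = np.random.default_rng(2)
# Placeholder similarity scores in (0, 1) for same-identity composite pairs;
# the shapes of these distributions are assumptions, not the paper's data.
aligned = rng.beta(2, 4, 500)
misaligned = rng.beta(4, 2, 500)

thresholds = np.linspace(0, 1, 101)

def hit_rate(sims, thresholds):
    """Fraction of pairs whose similarity exceeds each threshold."""
    return (sims[None, :] > thresholds[:, None]).mean(axis=1)

hr_aligned = hit_rate(aligned, thresholds)
hr_misaligned = hit_rate(misaligned, thresholds)
```

Plotting both hit-rate curves against the threshold reproduces the structure of the figure: any systematic gap between the two curves is the composite face effect at that layer.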
Figure 7. Discriminability of ISL units in response to Asian and Caucasian faces.
(A) Identification performance (left) and dissimilarity between the feature vectors of the two races (right; normalized Euclidean distance), using ISL features. A typical other-race effect (ORE), as observed in behavioral face recognition studies, can be seen; the ORE is highly significant in ISL. The model was trained on images from the NCKU dataset (Asian faces) and tested on Asian and Caucasian images from the Tarr dataset. The vertical axes show identification performance (left) and dissimilarity based on the normalized Euclidean distance (right). Blue bars show the results for Asian face images and red bars for Caucasian face images. (B) Dissimilarity (right) and performance (left) in ISL when the model was trained on Caucasian faces and tested on both Asian and Caucasian faces (Tarr dataset). In all plots, error bars are standard deviations over 10 runs; p-values were calculated using the ranksum test.

