Nature, 452 (7185), 352-5

Identifying Natural Images From Human Brain Activity



Kendrick N Kay et al. Nature.


A challenging goal in neuroscience is to be able to read out, or decode, mental content from brain activity. Recent functional magnetic resonance imaging (fMRI) studies have decoded orientation, position and object category from activity in visual cortex. However, these studies typically used relatively simple stimuli (for example, gratings) or images drawn from fixed categories (for example, faces, houses), and decoding was based on previous measurements of brain activity evoked by those same stimuli or categories. To overcome these limitations, here we develop a decoding method based on quantitative receptive-field models that characterize the relationship between visual stimuli and fMRI activity in early visual areas. These models describe the tuning of individual voxels for space, orientation and spatial frequency, and are estimated directly from responses evoked by natural images. We show that these receptive-field models make it possible to identify, from a large set of completely novel natural images, which specific image was seen by an observer. Identification is not a mere consequence of the retinotopic organization of visual areas; simpler receptive-field models that describe only spatial tuning yield much poorer identification performance. Our results suggest that it may soon be possible to reconstruct a picture of a person's visual experience from measurements of brain activity alone.

Conflict of interest statement

The authors declare no competing financial interests.


Fig. 1. Schematic of experiment
The experiment consisted of two stages. In the first stage, model estimation, fMRI data were recorded while each subject viewed a large collection of natural images. These data were used to estimate a quantitative receptive field model for each voxel. The model was based on a Gabor wavelet pyramid and described tuning along the dimensions of space, orientation, and spatial frequency. In the second stage, image identification, fMRI data were recorded while each subject viewed a collection of novel natural images. For each measurement of brain activity, we attempted to identify which specific image had been seen. This was accomplished by using the estimated receptive field models to predict brain activity for a set of potential images and then selecting the image whose predicted activity most closely matches the measured activity.
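The selection step in the second stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the predicted voxel activity patterns are already available, uses Pearson correlation as the similarity measure, and runs on hand-made toy numbers. The names `pearson`, `identify`, `predicted`, and `measured` are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def identify(measured, predicted_patterns):
    """Return the index of the candidate image whose predicted voxel
    pattern correlates most strongly with the measured pattern."""
    scores = [pearson(measured, p) for p in predicted_patterns]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical toy data: predicted activity of 5 voxels for 3 candidate images.
predicted = [
    [0.1, 0.9, 0.3, 0.5, 0.2],
    [0.8, 0.2, 0.7, 0.1, 0.4],
    [0.4, 0.4, 0.9, 0.8, 0.1],
]
# A hand-made noisy "measurement" resembling candidate 1.
measured = [0.7, 0.3, 0.6, 0.2, 0.5]

best = identify(measured, predicted)
print("selected candidate:", best)
```

In this toy case the measurement correlates positively only with candidate 1, so that candidate is selected.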
Fig. 2. Receptive field model for a representative voxel
a, Spatial envelope. The intensity of each pixel indicates the sensitivity of the receptive field (RF) to that location. The white circle delineates the bounds of the stimulus (20° × 20°) and the green square delineates the estimated RF location. Horizontal and vertical slices through the spatial envelope are shown below and to the left. These intersect the peak of the spatial envelope, as indicated by yellow tick marks. The thickness of each slice profile indicates ± 1 s.e.m. This RF is located in the left hemifield, just below the horizontal meridian. b, Orientation and spatial frequency tuning curves. The top matrix depicts the joint orientation and spatial frequency tuning of the RF, and the bottom two plots give the marginal orientation and spatial frequency tuning curves. Error bars indicate ± 1 s.e.m. This RF has broadband orientation tuning and high-pass spatial frequency tuning. For additional RF examples and population summaries of RF properties, see Supplementary Figs. 9–11.
Fig. 3. Identification performance
In the image identification stage of the experiment, fMRI data were recorded while each subject viewed 120 novel natural images that had not been used to estimate the receptive field models. For each of the 120 measured voxel activity patterns we attempted to identify which image had been seen. This figure illustrates identification performance for one subject (S1). The color at the mth column and nth row represents the correlation between the measured voxel activity pattern for the mth image and the predicted voxel activity pattern for the nth image. The highest correlation in each column is designated by an enlarged dot of the appropriate color, and indicates the image selected by the identification algorithm. For this subject 92% (110/120) of the images were identified correctly.
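The way a correlation matrix of this kind translates into a percent-correct score can be sketched as below. This is an illustration under stated assumptions: the matrix is indexed as `corr[n][m]` (row n is the predicted pattern for image n, column m is the measured pattern for image m, following the caption), and the 3 × 3 values are invented toy numbers, not data from the study. The function name `identification_accuracy` is hypothetical.

```python
def identification_accuracy(corr):
    """Fraction of columns whose largest entry lies on the diagonal.

    corr[n][m] is the correlation between the predicted pattern for
    image n and the measured pattern for image m.
    """
    n_images = len(corr)
    correct = 0
    for m in range(n_images):
        column = [corr[n][m] for n in range(n_images)]
        chosen = max(range(n_images), key=column.__getitem__)
        if chosen == m:  # the selected image is the one actually shown
            correct += 1
    return correct / n_images

# Invented toy matrix: images 0 and 2 are identified, image 1 is not.
toy = [
    [0.9, 0.1, 0.2],
    [0.2, 0.3, 0.1],
    [0.1, 0.6, 0.8],
]
accuracy = identification_accuracy(toy)
print(f"toy accuracy: {accuracy:.2f}")

# The figure quoted in the caption: 110 of 120 images, rounded to a percent.
print(round(100 * 110 / 120))
```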
Fig. 4. Factors that impact identification performance
a, Summary of identification performance. The bars indicate empirical performance for a set size of 120 images, the marker above each bar indicates the estimated noise ceiling (i.e. the theoretical maximum performance given the level of noise in the data), and the dashed green line indicates chance performance. The noise ceiling estimates suggest that the difference in performance across subjects is due to intrinsic differences in the level of noise. b, Scaling of identification performance with set size. The x-axis indicates set size, the y-axis indicates identification performance, and the number to the right of each line gives the estimated set size at which performance declines to 10% correct. In all cases performance scaled very well with set size. c, Retinotopy-only model versus Gabor wavelet pyramid model. Identification was attempted using an alternative retinotopy-only model that captures only the location and size of each voxel’s receptive field. This model performed substantially worse than the Gabor wavelet pyramid model, indicating that spatial tuning alone is insufficient to achieve optimal identification performance. (Results reflect repeated-trial performance averaged across subjects; see Supplementary Fig. 5 for detailed results.)
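To make the notions of chance performance and set-size dependence concrete, here is a rough sketch. It assumes chance is one over the set size, and it estimates performance at smaller set sizes by drawing random distractors from an existing correlation matrix. This subsampling is only an illustrative stand-in: it is not the procedure the authors used to extrapolate to larger set sizes. The function names and the toy matrix are hypothetical.

```python
import random

def chance_level(set_size):
    """Probability of picking the correct image by guessing uniformly."""
    return 1.0 / set_size

def accuracy_at_set_size(corr, set_size, n_draws=200, seed=0):
    """Estimate accuracy when each target image competes against
    set_size - 1 randomly drawn distractor images.

    corr[n][m]: correlation between the prediction for image n and the
    measurement for image m.
    """
    rng = random.Random(seed)
    n_images = len(corr)
    hits = 0
    total = 0
    for m in range(n_images):
        others = [n for n in range(n_images) if n != m]
        for _ in range(n_draws):
            distractors = rng.sample(others, set_size - 1)
            # A hit requires the target's prediction to beat every distractor's.
            if all(corr[m][m] > corr[n][m] for n in distractors):
                hits += 1
            total += 1
    return hits / total

# Invented toy matrix (not data from the study).
toy = [
    [0.9, 0.1, 0.2],
    [0.2, 0.3, 0.1],
    [0.1, 0.6, 0.8],
]
for size in (1, 2, 3):
    print(size, accuracy_at_set_size(toy, size), chance_level(size))
print("chance at 120 images:", chance_level(120))
```

For a set size of 120, chance under this assumption is 1/120, or a little under 1%.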
