Nat Commun. 2020 May 8;11(1):2313.
doi: 10.1038/s41467-020-15146-7.

Abstract representations of events arise from mental errors in learning and memory

Christopher W Lynn et al. Nat Commun. 2020.

Abstract

Humans are adept at uncovering abstract associations in the world around them, yet the underlying mechanisms remain poorly understood. Intuitively, learning the higher-order structure of statistical relationships should involve complex mental processes. Here we propose an alternative perspective: that higher-order associations instead arise from natural errors in learning and memory. Using the free energy principle, which bridges information theory and Bayesian inference, we derive a maximum entropy model of people's internal representations of the transitions between stimuli. Importantly, our model (i) affords a concise analytic form, (ii) qualitatively explains the effects of transition network structure on human expectations, and (iii) quantitatively predicts human reaction times in probabilistic sequential motor tasks. Together, these results suggest that mental errors influence our abstract representations of the world in significant and predictable ways, with direct implications for the study and design of optimally learnable information sources.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Subjects respond to sequences of stimuli drawn as random walks on an underlying transition graph.
a Example sequence of visual stimuli (left) representing a random walk on an underlying transition network (right). b For each stimulus, subjects are asked to respond by pressing a combination of one or two buttons on a keyboard. c Each of the 15 possible button combinations corresponds to a node in the transition network. We only consider networks with nodes of uniform degree k = 4 and edges with uniform transition probability 0.25. d Subjects were asked to respond to sequences of 1500 such nodes drawn from two different transition architectures: a modular graph (left) and a lattice graph (right). e Average reaction times for the different button combinations, where the diagonal elements represent single-button presses and the off-diagonal elements represent two-button presses. f Average reaction times as a function of trial number, characterized by a steep drop-off in the first 500 trials followed by a gradual decline in the remaining 1000 trials. In e and f, averages are taken over responses during random walks on the modular and lattice graphs. Source data are provided as a Source Data file.
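The stimulus sequences described in Fig. 1 can be sketched as a random walk over a 15-node graph in which every node has degree 4, so each outgoing edge is taken with probability 0.25. The caption does not spell out how the modular graph is wired; the layout below (three clusters of five nodes, each cluster fully connected except for its two boundary nodes, with boundary nodes linked to the neighboring cluster) is an assumption made for illustration, and the names `modular_graph` and `random_walk` are hypothetical.

```python
import random

def modular_graph():
    """Adjacency sets for an ASSUMED 15-node modular graph with uniform degree 4."""
    neighbors = {i: set() for i in range(15)}
    for base in (0, 5, 10):
        cluster = range(base, base + 5)
        for u in cluster:
            for v in cluster:
                # Fully connect the cluster, except its two boundary nodes.
                if u < v and (u, v) != (base, base + 4):
                    neighbors[u].add(v)
                    neighbors[v].add(u)
        # Link the last node of this cluster to the first node of the next one.
        nxt = (base + 5) % 15
        neighbors[base + 4].add(nxt)
        neighbors[nxt].add(base + 4)
    return neighbors

def random_walk(neighbors, length, seed=0):
    """Sample a walk; each of a node's neighbors is chosen with equal probability."""
    rng = random.Random(seed)
    node = rng.choice(sorted(neighbors))
    walk = [node]
    for _ in range(length - 1):
        node = rng.choice(sorted(neighbors[node]))
        walk.append(node)
    return walk

graph = modular_graph()
walk = random_walk(graph, 1500)  # 1500 stimuli, as in the caption
```

Because every node has exactly four neighbors, the uniform choice among neighbors reproduces the uniform transition probability of 0.25 stated in the caption.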
Fig. 2. The effects of higher-order network structure on human reaction times.
a Cross-cluster surprisal effect in the modular graph, defined by an average increase in reaction times for between-cluster transitions (right) relative to within-cluster transitions (left). We detect significant differences in reaction times for random walks (p < 0.001, t = 5.77, df = 1.61 × 10^5) and Hamiltonian walks (p = 0.010, t = 2.59, df = 1.31 × 10^4). For the mixed effects models used to estimate these effects, see Supplementary Tables 1 and 3. b Modular-lattice effect, characterized by an overall increase in reaction times in the lattice graph (right) relative to the modular graph (left). We detect a significant difference in reaction times for random walks (p < 0.001, t = 3.95, df = 3.33 × 10^5); see Supplementary Table 2 for the mixed effects model. Measurements were on independent subjects, statistical significance was computed using two-sided F-tests, and confidence intervals represent standard deviations. Source data are provided as a Source Data file.
Fig. 3. A maximum entropy model of transition probability estimates in humans.
a Illustration of the maximum entropy distribution P(Δt) representing the probability of recalling a stimulus Δt time steps from the target stimulus (dashed line). In the limit β → 0, the distribution becomes uniform over all past stimuli (left). In the opposite limit β → ∞, the distribution becomes a delta function on the desired stimulus (right). For intermediate amounts of noise, the distribution drops off monotonically (center). b Resulting internal estimates of the transition structure. For β → 0, the estimates become all-to-all, losing any resemblance to the true structure (left), while for β → ∞, the transition estimates become exact (right). At intermediate precision, the higher-order community structure organically comes into focus (center). c, d Predictions of the cross-cluster surprisal effect (c) and the modular-lattice effect (d) as functions of the inverse temperature β.
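The behavior described in Fig. 3 can be illustrated with a small numerical sketch. It rests on two assumptions that are not stated in this excerpt: that the memory distribution takes the exponential form P(Δt) = (1 − e^(−β)) e^(−βΔt), which matches the limits in panel a and the exponential fits in Fig. 5, and that the internal estimate averages the true transition matrix A over recall errors as the weighted sum of A^(Δt+1). The modular wiring and all function names are likewise hypothetical.

```python
import math

def modular_transition_matrix():
    """Transition matrix for an ASSUMED 15-node modular graph (degree 4, p = 0.25)."""
    n = 15
    A = [[0.0] * n for _ in range(n)]

    def link(u, v):
        A[u][v] = A[v][u] = 0.25

    for base in (0, 5, 10):
        for u in range(base, base + 5):
            for v in range(u + 1, base + 5):
                if (u, v) != (base, base + 4):  # boundary nodes are not linked
                    link(u, v)
        link(base + 4, (base + 5) % n)  # one edge to the next cluster
    return A

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def memory_distribution(beta, max_dt):
    """ASSUMED form: P(dt) = (1 - exp(-beta)) * exp(-beta * dt), truncated at max_dt."""
    eta = math.exp(-beta)
    return [(1 - eta) * eta ** dt for dt in range(max_dt)]

def internal_estimate(A, beta, max_dt=200):
    """Weighted sum of powers of A: sum over dt of P(dt) * A^(dt + 1)."""
    n = len(A)
    estimate = [[0.0] * n for _ in range(n)]
    power = A
    for weight in memory_distribution(beta, max_dt):
        for i in range(n):
            for j in range(n):
                estimate[i][j] += weight * power[i][j]
        power = matmul(power, A)
    return estimate

A = modular_transition_matrix()
A_hat = internal_estimate(A, beta=0.3)
within_cluster = A_hat[0][1]  # edge inside a cluster
cross_cluster = A_hat[4][5]   # edge between two clusters
```

Under these assumptions, both edges have true probability 0.25, yet the estimate for the within-cluster edge exceeds the estimate for the cross-cluster edge at intermediate β, while a very large β returns estimates that approach A itself.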
Fig. 4. Predicting reaction times for individual subjects.
a–f Estimated parameters and accuracy analysis for our maximum entropy model across 358 random walk sequences (across 286 subjects; Methods). a For the inverse temperature β, 40 sequences corresponded to the limit β → ∞ and 73 corresponded to the limit β → 0. Among the remaining 245 sequences, the average value of β was 0.30. b Distributions of the intercept r0 (left) and slope r1 (right). c Predicted reaction time as a function of a subject’s internal anticipation. Gray lines indicate 20 randomly selected sequences, and the red line shows the average prediction over all sequences. d Linear parameters for the third-order competing model; data points represent individual sequences and bars represent averages. e, f Comparing the performance of our maximum entropy model with the hierarchy of competing models up to third-order. Root mean squared error (RMSE; e) and Bayesian information criterion (BIC; f) of our model averaged over all sequences (dashed lines) compared to the competing models (solid lines); our model provides the best description of the data across all models considered. g–j Estimated parameters and accuracy analysis for our maximum entropy model across all Hamiltonian walk sequences (120 subjects). g For the inverse temperature β, 20 subjects were best described as performing maximum likelihood estimation (β → ∞), 19 lacked any notion of the transition structure (β → 0), and the remaining 81 subjects had an average value of β = 0.61. h Distributions of the intercept r0 (left) and slope r1 (right). i Average RMSE of our model (dashed line) compared to that of the competing models (solid line); our model maintains higher accuracy than the competing hierarchy up to the second-order model. j Average BIC of the maximum entropy model (dashed line) compared to that of the competing models (solid line); our model provides a better description of the data than the second- or third-order models. Source data are provided as a Source Data file.
Fig. 5. Measuring the memory distribution in an n-back experiment.
a Example of the 2-back memory task. Subjects view a sequence of stimuli (letters) and respond to each stimulus indicating whether it matches the target stimulus from two trials before. For each positive response that the current stimulus matches the target, we measure Δt by calculating the number of trials between the last instance of the current stimulus and the target. b Histograms of Δt (i.e., measurements of the memory distribution P(Δt)) across all subjects in the 1-, 2-, and 3-back tasks. Dashed lines indicate exponential fits to the observed distributions. The inverse temperature β is estimated for each task to be the negative slope of the exponential fit. c Memory distribution aggregated across the three n-back tasks. Dashed line indicates an exponential fit. We report a combined estimate of the inverse temperature β = 0.32 ± 0.01, where the standard deviation is estimated from 1000 bootstrap samples of the combined data. Measurements were on independent subjects. Source data are provided as a Source Data file.
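The estimate of β as the negative slope of an exponential fit can be sketched as a straight-line fit to the logarithm of the histogram counts. The caption does not say how the fit was performed, so ordinary least squares on log counts is an assumption here; the function name is hypothetical and the counts are synthetic values generated for illustration, not data from the study.

```python
import math

def fit_inverse_temperature(dts, counts):
    """Least-squares line through (dt, log count); returns the negative slope."""
    points = [(dt, math.log(c)) for dt, c in zip(dts, counts) if c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx

# Synthetic histogram that decays exactly as exp(-0.32 * dt) (illustrative only).
dts = list(range(10))
counts = [1000.0 * math.exp(-0.32 * dt) for dt in dts]
beta_hat = fit_inverse_temperature(dts, counts)
```

On noiseless synthetic counts the fit recovers the decay rate used to generate them; with real histograms, empty bins are skipped because their logarithm is undefined.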
Fig. 6. Network violations yield surprise that grows with topological distance.
a Ring graph consisting of 15 nodes, where each node is connected to its nearest neighbors and next-nearest neighbors on the ring. Starting from the boxed node, a sequence can undergo a standard transition (green), a short violation of the transition structure (blue), or a long violation (red). b Our model predicts that subjects’ anticipations of both short (blue) and long (red) violations should be weaker than their anticipations of standard transitions (left). Furthermore, we predict that subjects’ anticipations of violations should decrease with increasing topological distance (right). c Average effects of network violations across 78 subjects, estimated using a mixed effects model (Supplementary Tables 10 and 11), with error bars indicating one standard deviation from the mean. We find that standard transitions yield quicker reactions than both short violations (p < 0.001, t = 4.50, df = 7.15 × 10^4) and long violations (p < 0.001, t = 8.07, df = 7.15 × 10^4). Moreover, topologically shorter violations induce faster reactions than long violations (p = 0.011, t = 2.54, df = 3.44 × 10^3), thus confirming the predictions of our model. Measurements were on independent subjects, and statistical significance was computed using two-sided F-tests. Source data are provided as a Source Data file.
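The ring graph in panel a and the notion of topological distance can be sketched with a breadth-first search. The graph construction follows the caption (15 nodes, each linked to its nearest and next-nearest neighbors); the caption does not state which distances count as "short" or "long" violations, so the sketch only separates standard transitions (distance 1) from violations (distance 2 or more). Function names are hypothetical.

```python
from collections import deque

def ring_graph(n=15):
    """Each node links to its nearest and next-nearest neighbors on the ring."""
    return {i: sorted({(i + step) % n for step in (-2, -1, 1, 2)})
            for i in range(n)}

def topological_distances(neighbors, source):
    """Shortest-path length (number of edges) from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

graph = ring_graph()
dist = topological_distances(graph, 0)
standard = sorted(v for v, d in dist.items() if d == 1)  # allowed transitions
violations = {v: d for v, d in dist.items() if d >= 2}   # not edges of the graph
```

Starting from node 0, the four neighbors sit at distance 1 and every other node lies between 2 and 4 edges away, which gives the range of topological distances over which the predicted decrease in anticipation could be examined.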
