Modeling the Influence of Language Input Statistics on Children's Speech Production

Ingeborg Roete; Stefan L Frank; Paula Fikkert; Marisa Casillas

Cogn Sci. 2020 Dec;44(12):e12924. doi: 10.1111/cogs.12924.

Authors

Ingeborg Roete¹,², Stefan L Frank², Paula Fikkert², Marisa Casillas¹

Affiliations

¹ Language Development Department, Max Planck Institute for Psycholinguistics.
² Centre for Language Studies, Radboud University.

PMID: 33349953
DOI: 10.1111/cogs.12924

Abstract

We trained a computational model (the Chunk-Based Learner; CBL) on a longitudinal corpus of child-caregiver interactions in English to test whether one proposed statistical learning mechanism, backward transitional probability, is able to predict children's speech productions with stable accuracy throughout the first few years of development. We predicted that the model less accurately reconstructs children's speech productions as they grow older because children gradually begin to generate speech using abstracted forms rather than specific "chunks" from their speech environment. To test this idea, we trained the model on both recently encountered and cumulative speech input from a longitudinal child language corpus. We then assessed whether the model could accurately reconstruct children's speech. Controlling for utterance length and the presence of duplicate chunks, we found no evidence that the CBL becomes less accurate in its ability to reconstruct children's speech with age.
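
As an illustration of the statistic named in the abstract, the backward transitional probability of a word pair can be estimated as count(previous word, current word) / count(current word). The sketch below is a minimal, hypothetical example and not the authors' CBL implementation: the function names, the toy corpus, and the fixed chunking threshold are all assumptions made for illustration only.

```python
from collections import Counter


def backward_transitional_probabilities(utterances):
    """Estimate P(previous word | current word) for each adjacent word pair.

    Assumption: utterances are whitespace-tokenized strings. Utterance-initial
    words still count toward the denominator, so the estimates for a given
    current word need not sum to 1.
    """
    pair_counts = Counter()
    word_counts = Counter()
    for utterance in utterances:
        words = utterance.split()
        word_counts.update(words)
        pair_counts.update(zip(words, words[1:]))
    return {
        (prev, cur): count / word_counts[cur]
        for (prev, cur), count in pair_counts.items()
    }


def chunk(utterance, btp, threshold):
    """Split an utterance into chunks wherever the BTP drops below threshold.

    Assumption: a fixed threshold is used here purely for simplicity; it is a
    hypothetical stand-in for whatever boundary criterion a real model uses.
    """
    words = utterance.split()
    if not words:
        return []
    chunks = [[words[0]]]
    for prev, cur in zip(words, words[1:]):
        if btp.get((prev, cur), 0.0) >= threshold:
            chunks[-1].append(cur)   # strong backward link: extend the chunk
        else:
            chunks.append([cur])     # weak link: start a new chunk
    return chunks


# Toy corpus (invented for this example, not taken from the study's data).
toy_corpus = ["the dog barks", "the dog sleeps", "a dog barks"]
btp = backward_transitional_probabilities(toy_corpus)
chunks_a = chunk("a dog barks", btp, threshold=0.5)
chunks_the = chunk("the dog barks", btp, threshold=0.5)
```

In this toy corpus, "dog" is preceded by "the" in two of its three occurrences, so the pair ("the", "dog") gets an estimate of 2/3, while ("a", "dog") gets 1/3 and falls below the illustrative threshold.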

Keywords: Abstraction; Age invariance; CHILDES; Developmental trajectory; Language development; Statistical learning.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Aging
Caregivers / psychology
Child
Child Language*
Child, Preschool
Computer Simulation*
Humans
Infant
Infant, Newborn
Language Development*
Longitudinal Studies
Probability
Speech*