Modeling the Influence of Language Input Statistics on Children's Speech Production

Cogn Sci. 2020 Dec;44(12):e12924. doi: 10.1111/cogs.12924.

Abstract

We trained a computational model (the Chunk-Based Learner; CBL) on a longitudinal corpus of child-caregiver interactions in English to test whether one proposed statistical learning mechanism, backward transitional probability, can predict children's speech productions with stable accuracy throughout the first few years of development. We predicted that the model would reconstruct children's speech productions less accurately as they grow older, because children gradually begin to generate speech using abstracted forms rather than specific "chunks" from their speech environment. To test this idea, we trained the model on both recently encountered and cumulative speech input from a longitudinal child language corpus. We then assessed whether the model could accurately reconstruct children's speech. Controlling for utterance length and the presence of duplicate chunks, we found no evidence that the CBL becomes less accurate in its ability to reconstruct children's speech with age.
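For readers unfamiliar with the statistic named above, the following is a minimal sketch of how backward transitional probability is commonly estimated from word-pair counts: the probability of the preceding word given the current word, count(w1 w2) / count(w2). It is not the CBL implementation. The function name, the toy utterances, and the choice to count w2 only where it has a preceding word within the same utterance are assumptions made for illustration.

```python
from collections import Counter


def backward_transitional_probabilities(utterances):
    """Estimate P(previous word | current word) for every adjacent word pair.

    Hypothetical helper for illustration only; not taken from the CBL model.
    """
    pair_counts = Counter()    # count of each adjacent pair (w1, w2)
    second_counts = Counter()  # count of w2 as the second member of any pair
    for utterance in utterances:
        words = utterance.lower().split()
        for prev, curr in zip(words, words[1:]):
            pair_counts[(prev, curr)] += 1
            second_counts[curr] += 1
    # BTP(w1, w2) = count(w1 w2) / count(w2 preceded by any word)
    return {
        pair: n / second_counts[pair[1]]
        for pair, n in pair_counts.items()
    }


# Toy input, invented for this example (not drawn from the corpus).
toy_input = ["the dog barks", "a dog runs", "the cat runs"]
btp = backward_transitional_probabilities(toy_input)
```

In this toy input, "dog" is preceded once by "the" and once by "a", so the estimate for the pair ("the", "dog") is 0.5, whereas "cat" is only ever preceded by "the", giving 1.0 for ("the", "cat"). A chunk-based learner could treat low values as likely chunk boundaries, although the specific thresholding rule is not stated in the abstract.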

Keywords: Abstraction; Age invariance; CHILDES; Developmental trajectory; Language development; Statistical learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aging
  • Caregivers / psychology
  • Child
  • Child Language*
  • Child, Preschool
  • Computer Simulation*
  • Humans
  • Infant
  • Infant, Newborn
  • Language Development*
  • Longitudinal Studies
  • Probability
  • Speech*