Distributional regularity and phonotactic constraints are useful for segmentation

Cognition. Oct-Nov 1996;61(1-2):93-125. doi: 10.1016/s0010-0277(96)00719-6.


In order to acquire a lexicon, young children must segment speech into words, even though most words are unfamiliar to them. This is a non-trivial task because speech lacks any acoustic analog of the blank spaces between printed words. Two sources of information that might be useful for this task are distributional regularity and phonotactic constraints. Informally, distributional regularity refers to the intuition that sound sequences that occur frequently and in a variety of contexts are better candidates for the lexicon than those that occur rarely or in few contexts. We express that intuition formally by a class of functions called DR functions. We then put forth three hypotheses: First, that children segment using DR functions. Second, that they exploit phonotactic constraints on the possible pronunciations of words in their language. Specifically, they exploit both the requirement that every word must have a vowel and the constraints that languages impose on word-initial and word-final consonant clusters. Third, that children learn which word-boundary clusters are permitted in their language by assuming that all permissible word-boundary clusters will eventually occur at utterance boundaries. Using computational simulation, we investigate the effectiveness of these strategies for segmenting broad phonetic transcripts of child-directed English. The results show that DR functions and phonotactic constraints can be used to significantly improve segmentation. Further, the contributions of DR functions and phonotactic constraints are largely independent, so using both yields better segmentation than using either one alone. Finally, learning the permissible word-boundary clusters from utterance boundaries does not degrade segmentation performance.
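The two phonotactic constraints and the utterance-boundary learning strategy described above can be illustrated with a minimal sketch. It is not the authors' simulation. It assumes a toy alphabet with one character per phoneme and a hypothetical vowel inventory (`VOWELS`), and all function names are invented for illustration. DR functions are not defined in this abstract, so the sketch leaves them out. It only enumerates the segmentations that the phonotactic constraints allow.

```python
# Assumption: toy transcription, one character per phoneme.
VOWELS = set("aeiou")  # hypothetical vowel inventory, not from the paper


def initial_cluster(word):
    """Return the consonants that precede the first vowel of `word`."""
    i = 0
    while i < len(word) and word[i] not in VOWELS:
        i += 1
    return word[:i]


def final_cluster(word):
    """Return the consonants that follow the last vowel of `word`."""
    i = len(word)
    while i > 0 and word[i - 1] not in VOWELS:
        i -= 1
    return word[i:]


def learn_boundary_clusters(utterances):
    """Collect the clusters attested at utterance starts and ends.

    This mirrors the third hypothesis: clusters seen at utterance
    boundaries are taken to be the permissible word-boundary clusters.
    """
    onsets = {initial_cluster(u) for u in utterances}
    codas = {final_cluster(u) for u in utterances}
    return onsets, codas


def is_admissible(word, onsets, codas):
    """A word needs a vowel and attested initial and final clusters."""
    has_vowel = any(ch in VOWELS for ch in word)
    return (
        has_vowel
        and initial_cluster(word) in onsets
        and final_cluster(word) in codas
    )


def admissible_segmentations(utterance, onsets, codas):
    """Enumerate every split of `utterance` into admissible words."""
    if not utterance:
        return [[]]
    results = []
    for end in range(1, len(utterance) + 1):
        head = utterance[:end]
        if is_admissible(head, onsets, codas):
            for rest in admissible_segmentations(utterance[end:], onsets, codas):
                results.append([head] + rest)
    return results


# Usage: learn clusters from three unsegmented toy utterances,
# then list the segmentations of a new utterance that remain possible.
onsets, codas = learn_boundary_clusters(["dog", "big", "stop"])
print(admissible_segmentations("bigdog", onsets, codas))
```

In this toy setting the constraints rule out splits such as "bi" + "gdog" but still leave more than one candidate. That is consistent with the abstract's finding that phonotactic constraints and DR functions make largely independent contributions, with a separate scoring criterion needed to choose among the remaining candidates.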

MeSH terms

  • Adult
  • Algorithms
  • Attention
  • Child
  • Child, Preschool
  • Female
  • Humans
  • Infant
  • Language Development*
  • Male
  • Mental Recall
  • Phonetics*
  • Verbal Learning*
  • Vocabulary