Temporal modulations in speech and music

Neurosci Biobehav Rev. 2017 Oct;81(Pt B):181-187. doi: 10.1016/j.neubiorev.2017.02.011. Epub 2017 Feb 14.


Speech and music have structured rhythms. Here we discuss a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32 Hz) temporal modulations in sound intensity, and compare the modulation properties of speech and music. We analyze these modulations using over 25 h of speech and over 39 h of recordings of Western music. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics). A different but similarly consistent modulation spectrum is observed for music, including classical music played by single instruments of different types, symphonic, jazz, and rock. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2 Hz, respectively. These acoustically dominant time scales may be intrinsic features of speech and music, a possibility that should be investigated using more culturally diverse samples in each domain. Distinct modulation timescales for speech and music could facilitate their perceptual analysis and neural processing.
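
As a rough illustration of what a modulation spectrum of sound intensity is, the following minimal sketch computes one for a single broadband signal. The abstract does not describe the authors' analysis pipeline, so everything here is an assumption made for illustration: the function names `modulation_spectrum` and `peak_modulation_frequency` are hypothetical, the envelope is a simple frame-averaged squared amplitude rather than the output of any auditory model, and the test signal is synthetic noise modulated at 5 Hz rather than real speech.

```python
import numpy as np


def modulation_spectrum(signal, sample_rate, frame_ms=10.0):
    """Return (modulation frequencies in Hz, power) for 0.25-32 Hz.

    Simplified, assumed approach: broadband intensity envelope + FFT.
    """
    frame = int(sample_rate * frame_ms / 1000.0)
    n_frames = len(signal) // frame
    # Intensity envelope: mean squared amplitude in consecutive short frames.
    trimmed = signal[: n_frames * frame]
    envelope = (trimmed ** 2).reshape(n_frames, frame).mean(axis=1)
    # Remove the mean so the zero-frequency component does not dominate.
    envelope = envelope - envelope.mean()
    env_rate = sample_rate / frame  # sampling rate of the envelope, in Hz
    # Power spectrum of the Hann-windowed envelope.
    power = np.abs(np.fft.rfft(envelope * np.hanning(n_frames))) ** 2
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / env_rate)
    # Keep only the slow modulation range discussed in the abstract.
    keep = (freqs >= 0.25) & (freqs <= 32.0)
    return freqs[keep], power[keep]


def peak_modulation_frequency(signal, sample_rate):
    """Return the modulation frequency with the highest power."""
    freqs, power = modulation_spectrum(signal, sample_rate)
    return freqs[np.argmax(power)]


# Synthetic example (not real data): a noise carrier whose amplitude
# is modulated at 5 Hz, the rate the abstract associates with speech.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(0, 8.0, 1.0 / sample_rate)
carrier = rng.standard_normal(t.size)
test_signal = (1.0 + 0.8 * np.sin(2 * np.pi * 5.0 * t)) * carrier

peak = peak_modulation_frequency(test_signal, sample_rate)
print(f"Peak modulation frequency: {peak:.2f} Hz")
```

For the synthetic signal, the reported peak should fall close to the 5 Hz modulation rate that was imposed. Comparing corpora as the abstract describes would additionally require averaging such spectra over many recordings, which this sketch does not attempt.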

Keywords: Modulation spectrum; Music; Rhythm; Speech; Temporal modulations.

Publication types

  • Review

MeSH terms

  • Auditory Perception
  • Humans
  • Language
  • Music*
  • Periodicity
  • Sound Spectrography
  • Speech Acoustics*
  • Speech Perception
  • Time Factors