Extended High Frequencies Provide Both Spectral and Temporal Information to Improve Speech-in-Speech Recognition

Allison Trine; Brian B Monson

doi:10.1177/2331216520980299

Trends Hear. 2020 Jan-Dec:24:2331216520980299. doi: 10.1177/2331216520980299.

Authors

Allison Trine¹, Brian B Monson¹,²

Affiliations

¹ Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, United States.
² Neuroscience Program, University of Illinois at Urbana-Champaign, Champaign, United States.

Abstract

Several studies have demonstrated that extended high frequencies (EHFs; >8 kHz) in speech are not only audible but also have some utility for speech recognition, including for speech-in-speech recognition when maskers are facing away from the listener. However, the contribution of EHF spectral versus temporal information to speech recognition is unknown. Here, we show that access to EHF temporal information improved speech-in-speech recognition relative to speech bandlimited at 8 kHz but that additional access to EHF spectral detail provided a further small but significant benefit. Results suggest that both EHF spectral structure and the temporal envelope contribute to the observed EHF benefit. Speech recognition performance was quite sensitive to masker head orientation, with a rotation of only 15° providing a highly significant benefit. An exploratory analysis indicated that pure-tone thresholds at EHFs are better predictors of speech recognition performance than low-frequency pure-tone thresholds.

Keywords: head orientation; speech in noise; speech perception.

MeSH terms

Audiometry, Pure-Tone
Auditory Threshold
Humans
Noise
Perceptual Masking
Speech Perception*
Speech*