Auditory Time-Frequency Masking for Spectrally and Temporally Maximally-Compact Stimuli

Thibaud Necciari; Bernhard Laback; Sophie Savel; Sølvi Ystad; Peter Balazs; Sabine Meunier; Richard Kronland-Martinet

doi:10.1371/journal.pone.0166937

PLoS One. 2016 Nov 22;11(11):e0166937. doi: 10.1371/journal.pone.0166937. eCollection 2016.

Authors

Thibaud Necciari¹, Bernhard Laback¹, Sophie Savel², Sølvi Ystad², Peter Balazs¹, Sabine Meunier², Richard Kronland-Martinet²

Affiliations

¹ Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria.
² Laboratoire de Mécanique et d'Acoustique, CNRS UPR 7051, Equipe Sons, Aix-Marseille Université, Centrale Marseille, Marseille, France.

Abstract

Many audio applications perform perception-based time-frequency (TF) analysis by decomposing sounds into a set of functions with good TF localization (i.e. with a small essential support in the TF domain) using TF transforms and applying psychoacoustic models of auditory masking to the transform coefficients. To accurately predict masking interactions between coefficients, the TF properties of the model should match those of the transform. This involves having masking data for stimuli with good TF localization. However, little is known about TF masking for mathematically well-localized signals. Most existing masking studies used stimuli that are broad in time and/or frequency and few studies involved TF conditions. Consequently, the present study had two goals. The first was to collect TF masking data for well-localized stimuli in humans. Masker and target were 10-ms Gaussian-shaped sinusoids with a bandwidth of approximately one critical band. The overall pattern of results is qualitatively similar to existing data for long maskers. To facilitate implementation in audio processing algorithms, a dataset provides the measured TF masking function. The second goal was to assess the potential effect of auditory efferents on TF masking using a modeling approach. The temporal window model of masking was used to predict present and existing data in two configurations: (1) with standard model parameters (i.e. without efferents), (2) with cochlear gain reduction to simulate the activation of efferents. The ability of the model to predict the present data was quite good with the standard configuration but highly degraded with gain reduction. Conversely, the ability of the model to predict existing data for long maskers was better with than without gain reduction. Overall, the model predictions suggest that TF masking can be affected by efferent (or other) effects that reduce cochlear gain. Such effects were avoided in the experiment of this study by using maximally-compact stimuli.
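
As an illustration of the kind of stimulus described above, the following minimal sketch generates a 10-ms Gaussian-shaped sinusoid. Only the 10-ms duration and the Gaussian shape come from the abstract. The carrier frequency (4000 Hz), the shape factor `gamma_hz` (600 Hz), the sampling rate, the envelope formula exp(-π(Γt)²), and the function names are assumptions chosen for illustration and may differ from the stimuli actually used in the study. The second function estimates the equivalent rectangular duration of the envelope under one common definition (area divided by peak), which for this envelope is close to 1/Γ.

```python
import math


def gaussian_tone(carrier_hz=4000.0, gamma_hz=600.0, duration_s=0.010, fs=48000):
    """Return (envelope, signal) for a Gaussian-shaped sinusoid.

    All default values except duration_s are illustrative assumptions.
    The envelope is exp(-pi * (gamma_hz * t)**2), centered on the window.
    """
    n = int(round(duration_s * fs))
    mid = (n - 1) / 2.0  # center the Gaussian in the window
    envelope, signal = [], []
    for i in range(n):
        t = (i - mid) / fs  # time in seconds relative to the center
        env = math.exp(-math.pi * (gamma_hz * t) ** 2)
        envelope.append(env)
        signal.append(env * math.sin(2.0 * math.pi * carrier_hz * t))
    return envelope, signal


def equivalent_rectangular_duration(envelope, fs):
    """Area under the amplitude envelope divided by its peak, in seconds."""
    return sum(envelope) / (max(envelope) * fs)


if __name__ == "__main__":
    env, sig = gaussian_tone()
    erd = equivalent_rectangular_duration(env, 48000)
    print(f"samples: {len(sig)}, equivalent rectangular duration: {erd * 1000:.3f} ms")
```

With this envelope, a larger `gamma_hz` shortens the stimulus in time and widens it in frequency, which is the trade-off that makes a Gaussian shape attractive when compactness in both domains is wanted.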

MeSH terms

Female
Humans
Male
Models, Biological*
Pitch Perception / physiology*
Sound Localization / physiology*

Grants and funding

This work was supported by the Austrian Science Fund, http://www.fwf.ac.at, grant numbers I 1362-N30 and Y 551-N13 (funding received by TN and PB). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.