On the speaker discriminatory power asymmetry regarding acoustic-phonetic parameters and the impact of speaking style

Julio Cesar Cavalcanti; Anders Eriksson; Plinio A Barbosa

doi:10.3389/fpsyg.2023.1101187

Front Psychol. 2023 Apr 17:14:1101187. doi: 10.3389/fpsyg.2023.1101187. eCollection 2023.

Authors

Julio Cesar Cavalcanti¹,², Anders Eriksson¹, Plinio A Barbosa²

Affiliations

¹ Laboratory of Phonetics, Department of Linguistics, Stockholm University, Stockholm, Sweden.
² Institute of Language Studies, Department of Linguistics, University of Campinas, Campinas, Brazil.

Abstract

This study aimed to assess what we refer to as the speaker discriminatory power asymmetry and its forensic implications in comparisons performed across different speaking styles: spontaneous dialogues vs. interviews. We also addressed the impact of data sampling on speaker discriminatory performance for different acoustic-phonetic estimates. The participants were 20 male speakers of Brazilian Portuguese from the same dialectal area. The speech material consisted of spontaneous telephone conversations between familiar individuals, and interviews conducted between each participant and the researcher. Nine acoustic-phonetic parameters were chosen for the comparisons, spanning temporal, melodic, and spectral acoustic-phonetic estimates. An analysis based on the combination of different parameters was also conducted. Two speaker discriminatory metrics were examined: the log-likelihood-ratio cost (Cllr) and the Equal Error Rate (EER). A general speaker discriminatory trend emerged when the parameters were assessed individually. Parameters from the temporal acoustic-phonetic class showed the weakest speaker contrasting power, as evidenced by their relatively higher Cllr and EER values. Among the acoustic parameters assessed, spectral parameters, mainly the high formant frequencies F3 and F4, performed best in terms of speaker discrimination, yielding the lowest EER and Cllr scores. The results appear to suggest a speaker discriminatory power asymmetry across parameters from different acoustic-phonetic classes, in which temporal parameters tended to present lower discriminatory power. The speaking style mismatch also seemed to considerably impact the speaker comparison task by undermining the overall discriminatory performance. A statistical model based on the combination of different acoustic-phonetic estimates was found to perform best in this case. Finally, data sampling proved to be of crucial relevance for the reliability of discriminatory power assessment.
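
The two metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe how likelihood ratios were obtained or calibrated, so the function names (`cllr`, `eer`) and the toy likelihood ratios below are hypothetical. The sketch assumes the standard definition of Cllr, which averages the penalties log2(1 + 1/LR) over same-speaker comparisons and log2(1 + LR) over different-speaker comparisons, and it approximates the EER by sweeping thresholds over the observed scores.

```python
import math


def cllr(same_lrs, diff_lrs):
    """Log-likelihood-ratio cost from likelihood ratios (plain LRs, not logs).

    Lower is better; a system that always outputs LR = 1 scores exactly 1.
    """
    # Same-speaker comparisons are penalised when the LR is small.
    pen_same = sum(math.log2(1 + 1 / lr) for lr in same_lrs) / len(same_lrs)
    # Different-speaker comparisons are penalised when the LR is large.
    pen_diff = sum(math.log2(1 + lr) for lr in diff_lrs) / len(diff_lrs)
    return 0.5 * (pen_same + pen_diff)


def eer(same_scores, diff_scores):
    """Approximate Equal Error Rate.

    Sweeps every observed score as a decision threshold and returns the mean
    of the false-acceptance and false-rejection rates at the threshold where
    the two rates are closest.
    """
    best_gap, best_rate = None, None
    for t in sorted(set(same_scores) | set(diff_scores)):
        # Same-speaker pairs scored below the threshold are falsely rejected.
        frr = sum(s < t for s in same_scores) / len(same_scores)
        # Different-speaker pairs at or above the threshold are falsely accepted.
        far = sum(s >= t for s in diff_scores) / len(diff_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2
    return best_rate


# Toy likelihood ratios, invented purely for illustration.
same_speaker_lrs = [0.5, 2, 4, 8]
diff_speaker_lrs = [0.1, 0.25, 1, 3]

print("Cllr:", round(cllr(same_speaker_lrs, diff_speaker_lrs), 3))
print("EER:", eer(same_speaker_lrs, diff_speaker_lrs))
```

Under both metrics, lower values indicate better speaker discrimination, which is why the abstract describes the temporal parameters (higher Cllr and EER) as the weakest and F3 and F4 (lowest Cllr and EER) as the strongest.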

Keywords: acoustic phonetics; forensic phonetics; phonetics; speaker comparison; speech analysis.

Associated data

figshare/10.6084/m9.figshare.21571866.v1

Grants and funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance code 001, Grant 88887.635581/2021-00, and in part by the National Council for Scientific and Technological Development (CNPq)—Grant 140364/2017-0.