Analyzing Emotion Expression in Singing via Flow Glottograms, Long-Term-Average Spectra, and Expert Listener Evaluation

J Voice. 2021 Jan;35(1):52-60. doi: 10.1016/j.jvoice.2019.08.007. Epub 2019 Sep 20.


Background: Acoustic aspects of emotional expressivity in speech have been analyzed extensively during recent decades. Emotional coloring is an important, if not the most important, property of sung performance and is therefore strictly controlled. Hence, studying emotional expressivity in singing may provide deeper insight into vocal signaling of emotions. Furthermore, physiological voice source parameters can be assumed to facilitate the understanding of acoustical characteristics.

Method: Three highly experienced professional male singers sang scales on the vowel /ae/ or /a/ in 10 emotional colors (Neutral, Sadness, Tender, Calm, Joy, Contempt, Fear, Pride, Love, Arousal, and Anger). Sixteen voice experts classified the scales in a forced-choice listening test, and the result was compared with long-term-average spectrum (LTAS) parameters and with voice source parameters, derived from flow glottograms (FLOGG) that were obtained from inverse filtering the audio signal.
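As an illustration of what an LTAS is, the following is a minimal sketch that averages the power spectra of overlapping Hann-windowed frames and reports the result in dB. It is not the analysis procedure used in the study: the function name `ltas_db`, the frame length, the hop size, and the synthetic test tone are all assumptions chosen for the example, and a naive DFT stands in for an FFT to keep the code dependency-free.

```python
import cmath
import math


def ltas_db(signal, frame_len=64, hop=32):
    """Average the power spectra of Hann-windowed frames; return levels in dB.

    Hypothetical helper for illustration only (frame_len and hop are assumed values).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1          # bins from 0 Hz up to the Nyquist frequency
    power_sum = [0.0] * n_bins
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        for k in range(n_bins):
            # Naive DFT of one bin; slow, but adequate for a short sketch.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            power_sum[k] += abs(acc) ** 2
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    eps = 1e-12                          # avoids log10(0) for silent bins
    return [10 * math.log10(p / n_frames + eps) for p in power_sum]


# Synthetic tone with exactly 8 cycles per 64-sample frame (assumed test input).
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(512)]
spectrum = ltas_db(tone)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)
```

Summary measures such as level differences between frequency bands could then be read off the averaged spectrum; which LTAS parameters the study actually used is not stated in this abstract.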

Results: On the basis of component analysis, the emotions could be grouped into four "families": Anger-Contempt, Joy-Love-Pride, Calm-Tender-Neutral, and Sad-Fear. Recognition of the intended emotion families by listeners reached accuracy levels far beyond chance level. Vocal loudness had a paramount influence on all LTAS and FLOGG parameters. Even after partialing out this factor, some significant correlations were found between FLOGG and LTAS parameters. These parameters could be sorted into groups that were associated with the emotion families.
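"Partialing out" loudness can be illustrated with a first-order partial correlation, which measures how two parameters covary once the linear effect of a third variable has been removed from both. The sketch below is only an illustration under stated assumptions: the function names `pearson` and `partial_corr` are hypothetical, the numbers are invented rather than taken from the study, and the abstract does not specify which statistical procedure the authors applied.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("x or y is perfectly predicted by z")
    return (r_xy - r_xz * r_yz) / denom


# Invented illustration data, not measurements from the study.
loudness = [1, 2, 3, 4]      # stands in for vocal loudness
flogg_param = [2, 1, 2, 5]   # stands in for one FLOGG parameter
ltas_param = [0, 3, 4, 3]    # stands in for one LTAS parameter

print(round(pearson(flogg_param, ltas_param), 3))
print(round(partial_corr(flogg_param, ltas_param, loudness), 3))
```

In this constructed example both parameters rise with loudness, which yields a weakly positive raw correlation, while the parts not explained by loudness move in opposite directions, so the partial correlation is strongly negative. This shows why a dominant shared factor has to be controlled for before relating the two parameter sets.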

Conclusions: (i) Both LTAS and FLOGG parameters varied significantly with the enactment intentions of the singers. (ii) Some aspects of the voice source are reflected in LTAS parameters. (iii) LTAS parameters affect listener judgment of the enacted emotions and the accuracy of the intended emotional coloring.

Keywords: Classical tradition; Emotion families; Enacting; Loudness; Parameter groups.

MeSH terms

  • Acoustics
  • Auditory Perception
  • Emotions
  • Humans
  • Male
  • Singing*
  • Voice*