Hearing a face: cross-modal speaker matching using isolated visible speech

Percept Psychophys. 2006 Jan;68(1):84-93. doi: 10.3758/bf03193658.

Abstract

An experiment was performed to test whether cross-modal speaker matches could be made using isolated visible speech movement information. Visible speech movements were isolated using a point-light technique. In five conditions, subjects were asked to match a voice to one of two (unimodal) speaking point-light faces on the basis of speaker identity. Two of these conditions were designed to maintain the idiosyncratic speech dynamics of the speakers, whereas the other three deleted or distorted the dynamics in various ways. Some of these conditions also equated the video frames across the dynamically correct and distorted movements. The results revealed generally better matching performance in the conditions that maintained the correct speech dynamics than in those that did not, even though those conditions contained exactly the same video frames. The results suggest that visible speech movements themselves can support cross-modal speaker matching.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adolescent
  • Adult
  • Face*
  • Facial Expression
  • Female
  • Humans
  • Male
  • Speech Perception*
  • Visual Perception*