Integration of information from face and voice plays a central role in our social interactions. It has been mostly studied in the context of audiovisual speech perception: integration of affective or identity information has received comparatively little scientific attention. Here, we review behavioural and neuroimaging studies of face-voice integration in the context of person perception. Clear evidence for interference between facial and vocal information has been observed during affect recognition or identity processing. Integration effects on cerebral activity are apparent both at the level of heteromodal cortical regions of convergence, particularly bilateral posterior superior temporal sulcus (pSTS), and at 'unimodal' levels of sensory processing. Whether the latter reflects feedback mechanisms or direct crosstalk between auditory and visual cortices is as yet unclear.