The functional neuroanatomy of speech perception has been difficult to characterize. Part of the difficulty, we suggest, stems from the fact that the neural systems supporting 'speech perception' vary as a function of the task. Specifically, the set of cognitive and neural systems involved in performing traditional laboratory speech perception tasks, such as syllable discrimination or identification, overlaps only partially with the set involved in speech perception as it occurs during natural language comprehension. In this review, we argue that cortical fields in the posterior-superior temporal lobe, bilaterally, constitute the primary substrate for constructing sound-based representations of speech, and that these sound-based representations interface with different supramodal systems in a task-dependent manner. Tasks that require access to the mental lexicon (i.e., accessing meaning-based representations) rely on auditory-to-meaning interface systems in the cortex in the vicinity of the left temporal-parietal-occipital junction. Tasks that require explicit access to speech segments rely on auditory-motor interface systems in the left frontal and parietal lobes. This auditory-motor interface system also appears to be recruited in phonological working memory.