A striking property of speech perception is its resilience in the face of acoustic variability (for example, among speech sounds produced by different speakers at different times). The robustness of speech perception might, in part, result from multiple, complementary representations of the input, which operate in both acoustic-phonetic feature-based and articulatory-gestural domains. Recent studies of the anatomical and functional organization of the non-human primate auditory cortical system point to multiple, parallel, hierarchically organized processing pathways that involve the temporal, parietal and frontal cortices. Functional neuroimaging evidence indicates that a similar organization might underlie speech perception in humans. These parallel, hierarchical processing 'streams', both within and across hemispheres, might operate on distinguishable, complementary types of representations and subserve complementary types of processing. Two long-opposed views of speech perception have posited a basis either in acoustic feature processing or in gestural motor processing; the view put forward here might help to reconcile these positions.