It is increasingly recognized that the human planum temporale is not a dedicated language processor, but is in fact engaged in the analysis of many types of complex sound. We propose a model of the human planum temporale as a computational engine for the segregation and matching of spectrotemporal patterns. The model is based on segregating the components of the acoustic world and matching these components with learned spectrotemporal representations. Spectrotemporal information derived from such a 'computational hub' would be gated to higher-order cortical areas for further processing, leading to object recognition and the perception of auditory space. We review the evidence for the model and specific predictions that follow from it.
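To make the two proposed stages concrete, the following is a minimal illustrative sketch (not the authors' model, and not intended as a neural implementation) of segregation followed by template matching: an incoming sound is decomposed into a spectrotemporal pattern, and that pattern is compared against stored spectrotemporal representations. The input signal, the templates, and the names used here are synthetic placeholders chosen purely for illustration.

```python
# Illustrative sketch of "segregate, then match" on spectrotemporal patterns.
# All signals and templates are synthetic stand-ins, not data from the paper.
import numpy as np
from scipy.signal import spectrogram

fs = 16000                                   # sample rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)

# Synthetic acoustic input: an upward frequency sweep embedded in noise.
sound = np.sin(2 * np.pi * (500 + 1500 * t) * t) + 0.5 * np.random.randn(t.size)

# Stage 1: spectrotemporal segregation -- a time-frequency decomposition.
freqs, times, S = spectrogram(sound, fs=fs, nperseg=256, noverlap=128)
S = S / (np.linalg.norm(S) + 1e-12)          # normalize the incoming pattern

# Stage 2: match against "learned" spectrotemporal templates
# (random placeholders standing in for stored representations).
rng = np.random.default_rng(0)
templates = {name: rng.random(S.shape) for name in ("voice", "footsteps", "rain")}

def match_score(pattern, template):
    """Normalized correlation between an incoming pattern and a stored template."""
    template = template / (np.linalg.norm(template) + 1e-12)
    return float(np.sum(pattern * template))

scores = {name: match_score(S, tpl) for name, tpl in templates.items()}
best = max(scores, key=scores.get)           # candidate output gated onward
print(scores, "->", best)
```

In this toy version, the best-matching template is simply the one with the highest correlation; in the model proposed here, the analogous output would be gated to higher-order cortical areas for object recognition and the perception of auditory space.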