To form a percept of the environment, the brain needs to solve the binding problem: inferring whether signals come from a common cause and should be integrated, or come from independent causes and should be segregated. Behaviourally, humans solve this problem near-optimally, as predicted by Bayesian causal inference, but the underlying neural mechanisms remain unclear. Combining Bayesian modelling, electroencephalography (EEG), and multivariate decoding in an audiovisual spatial localisation task, we show that the brain accomplishes Bayesian causal inference by dynamically encoding multiple spatial estimates. Initially, auditory and visual signal locations are estimated independently; next, an estimate is formed that combines information from vision and audition. Yet it is only from 200 ms onwards that the brain integrates audiovisual signals, weighted by their bottom-up sensory reliabilities and top-down task relevance, into spatial priority maps that guide behavioural responses. As predicted by Bayesian causal inference, these spatial priority maps take into account the brain's uncertainty about the world's causal structure and flexibly arbitrate between sensory integration and segregation. The dynamic evolution of perceptual estimates thus reflects the hierarchical nature of Bayesian causal inference, a statistical computation that is crucial for effective interactions with the environment.
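For readers unfamiliar with the model, a minimal sketch of the standard Bayesian causal inference formulation (in the style of Körding et al., 2007) may help; the symbols below are the conventional model variables, not quantities stated in this abstract: $x_A$ and $x_V$ are noisy internal auditory and visual samples with sensory noise $\sigma_A$ and $\sigma_V$, $\mu_P$ and $\sigma_P$ parameterise a spatial prior, and $p_{\mathrm{common}}$ is the prior probability of a common cause. Under a common cause ($C=1$), the signals are fused with reliability weighting:

$$\hat{S}_{AV} = \frac{x_A/\sigma_A^2 + x_V/\sigma_V^2 + \mu_P/\sigma_P^2}{1/\sigma_A^2 + 1/\sigma_V^2 + 1/\sigma_P^2},$$

whereas under independent causes ($C=2$) each signal is estimated separately, e.g. for audition:

$$\hat{S}_{A,C=2} = \frac{x_A/\sigma_A^2 + \mu_P/\sigma_P^2}{1/\sigma_A^2 + 1/\sigma_P^2}.$$

The brain's uncertainty about the causal structure enters through the posterior

$$P(C=1 \mid x_A, x_V) = \frac{P(x_A, x_V \mid C=1)\, p_{\mathrm{common}}}{P(x_A, x_V \mid C=1)\, p_{\mathrm{common}} + P(x_A, x_V \mid C=2)\,(1 - p_{\mathrm{common}})},$$

and a final estimate that arbitrates between integration and segregation is obtained, for instance, by model averaging:

$$\hat{S}_A = P(C=1 \mid x_A, x_V)\, \hat{S}_{AV} + \bigl(1 - P(C=1 \mid x_A, x_V)\bigr)\, \hat{S}_{A,C=2}.$$

This hierarchy, with unisensory estimates, a fused estimate, and a causal-structure-weighted combination, mirrors the sequence of spatial estimates whose neural dynamics are reported here.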