The pattern of retinal binocular disparities acquired by a fixating visual system depends on both the depth structure of the scene and the viewing geometry. This paper treats the problem of interpreting the disparity pattern in terms of scene structure without relying on estimates of fixation position from eye movement control and proprioception mechanisms. We propose a sequential decomposition of this interpretation process into disparity correction, which is used to compute three-dimensional structure up to a relief transformation, and disparity normalization, which is used to resolve the relief ambiguity to obtain metric structure. We point out that the disparity normalization stage can often be omitted, since relief transformations preserve important properties such as depth ordering and coplanarity. Based on this framework we analyse three previously proposed computational models of disparity processing: the Mayhew and Longuet-Higgins model, the deformation model and the polar angle disparity model. We show how these models are related, and argue that none of them can account satisfactorily for available psychophysical data. We therefore propose an alternative model, regional disparity correction. Using this model we derive predictions for a number of experiments based on vertical disparity manipulations, and compare them to available experimental data. The paper concludes with a summary and a discussion of the possible architectures and mechanisms underlying stereopsis in the human visual system.