When dealing with natural scenes, sensory systems have to process an often messy and ambiguous flow of information. A stable perceptual organization nevertheless has to be achieved in order to guide behavior. The neural mechanisms involved can be highlighted by intrinsically ambiguous situations. In such cases, bistable perception occurs: distinct interpretations of the unchanging stimulus alternate spontaneously in the mind of the observer. Bistable stimuli have been used extensively for more than two centuries to study visual perception. Here we demonstrate that bistable perception also occurs in the auditory modality. We compared the temporal dynamics of percept alternations observed during auditory streaming with those observed for visual plaids and the susceptibilities of both modalities to volitional control. Strong similarities indicate that auditory and visual alternations share common principles of perceptual bistability. The absence of correlation across modalities for subject-specific biases, however, suggests that these common principles are implemented at least partly independently across sensory modalities. We propose that visual and auditory perceptual organization could rely on distributed but functionally similar neural competition mechanisms aimed at resolving sensory ambiguities.