We are continuously surrounded by a noisy and ever-changing environment. Instead of analyzing all the elements in a scene, our visual system has the ability to compress an enormous amount of visual information into ensemble representations, such as perceiving a forest instead of every single tree. Still, it is unclear why such complex scenes appear to be the same from moment to moment despite fluctuations, noise, and discontinuities in retinal images. The general effects of change blindness are usually thought to stabilize scene perception, making us unaware of minor inconsistencies between scenes. Here, we propose an alternative, that stable scene perception is actively achieved by the visual system through global serial dependencies: the appearance of scene gist is sequentially dependent on the gist perceived in previous moments. To test this hypothesis, we used summary statistical information as a proxy for "gist" level, global information in a scene. We found evidence for serial dependence in summary statistical representations. Furthermore, we show that this kind of serial dependence occurs at the ensemble level, where local elements are already merged into global representations. Taken together, our results provide a mechanism through which serial dependence can promote the apparent consistency of scenes over time.