Psychophysical and physiological evidence indicates that the visual system of primates and humans has evolved a specialized processing focus moving across the visual scene. This study addresses the question of how simple networks of neuron-like elements can account for a variety of phenomena associated with this shift of selective visual attention. Specifically, we propose the following: (1) A number of elementary features, such as color, orientation, direction of movement, disparity etc. are represented in parallel in different topographical maps, called the early representation. (2) There exists a selective mapping from the early topographic representation into a more central non-topographic representation, such that at any instant the central representation contains the properties of only a single location in the visual scene, the selected location. We suggest that this mapping is the principal expression of early selective visual attention. One function of selective attention is to fuse information from different maps into one coherent whole. (3) Certain selection rules determine which locations will be mapped into the central representation. The major rule, using the conspicuity of locations in the early representation, is implemented using a so-called Winner-Take-All network. Inhibiting the selected location in this network causes an automatic shift towards the next most conspicious location. Additional rules are proximity and similarity preferences. We discuss how these rules can be implemented in neuron-like networks and suggest a possible role for the extensive back-projection from the visual cortex to the LGN.