The existence of recurrent collateral connections between pyramidal cells within a cortical area and, in addition, of reciprocal connections between connected cortical areas is well established. In this work we analyse the properties of a tri-modular architecture of this type in which two input modules have convergent connections to a third module (which in the brain might be the next module in cortical processing or a bi-modal area receiving connections from two different processing pathways). Memory retrieval is analysed in this system, which has attractor states and Hebb-like synaptic modifiability in its connections. Local activity features are stored in the intra-modular connections, while the associations between corresponding features in different modules present during training are stored in the inter-modular connections. The response of the network when tested with corresponding and with contradictory stimuli to the two input pathways is studied in detail. The model is solved quantitatively using techniques of statistical physics. In one type of test, a sequence of stimuli is applied, with a delay between them. It is found that if the coupling between the modules is low, a regime exists in which they retain the capability to retrieve any of their stored features independently of the features being retrieved by the other modules. Although independent in this sense, the modules still influence each other in this regime through persistent modulatory currents, which are strong enough to initiate recall in the whole network when only a single module is stimulated, and to raise the mean firing rates of the neurons in the attractors if the features in the different modules correspond. Some of these mechanisms might be useful for the description of many phenomena observed in single-neuron activity recorded during short-term memory tasks such as delayed match-to-sample.
It is also shown that, with contradictory stimulation of the two input modules, the model accounts for many of the phenomena observed in the McGurk effect, in which contradictory auditory and visual inputs can lead to misperception.
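The architecture and the sequential-stimulation test described above can be illustrated with a minimal toy simulation. This is only a sketch under strong simplifying assumptions and not the model solved in this work: it uses ternary threshold neurons (+1, -1, or silent 0), synchronous updates, and a plain Hebbian rule, with no statistical-physics treatment. All names and values (`N`, `P`, `THETA`, `g`, `stim_strength`, `run`) are hypothetical choices made for illustration. In this toy setting, weak coupling means roughly `THETA < g < 1`: the inter-modular current exceeds the activation threshold, so it can initiate recall in a silent module, but it is weaker than a module's own recurrent current, so it cannot override a feature that module already holds.

```python
import random

N = 1000       # neurons per module (assumed value)
P = 3          # features stored per module (assumed value)
THETA = 0.2    # activation threshold; weaker fields leave a neuron silent (assumed)
MODULES = ("A", "B", "C")
# A and B are input modules; each is reciprocally connected to C only.
NEIGHBOURS = {"A": ("C",), "B": ("C",), "C": ("A", "B")}


def overlap(state, pattern):
    """Normalised overlap between a module state and a stored feature."""
    return sum(s * x for s, x in zip(state, pattern)) / len(pattern)


def step(states, patterns, g, stimulus):
    """One synchronous update of all three modules.

    With Hebbian weights (intra-modular strength 1, inter-modular strength g),
    the field on a neuron is a sum over features of the feature value times
    the overlaps, so the weight matrices never need to be built explicitly.
    """
    m = {X: [overlap(states[X], xi) for xi in patterns[X]] for X in MODULES}
    new_states = {}
    for X in MODULES:
        # Drive along feature mu: own overlap plus g times neighbours' overlaps.
        drive = [m[X][mu] + g * sum(m[Y][mu] for Y in NEIGHBOURS[X])
                 for mu in range(P)]
        ext = stimulus.get(X)
        new = []
        for i in range(N):
            h = sum(patterns[X][mu][i] * drive[mu] for mu in range(P))
            if ext is not None:
                h += ext[i]
            new.append(1 if h > THETA else -1 if h < -THETA else 0)
        new_states[X] = new
    return new_states


def run(g=0.5, seed=0, stim_strength=3.0, stim_steps=5, delay_steps=5):
    """Cue A with feature 0, wait, then cue B with the contradictory feature 1."""
    rng = random.Random(seed)
    patterns = {X: [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(P)]
                for X in MODULES}
    states = {X: [0] * N for X in MODULES}   # all modules start silent

    def apply(stimulus, steps):
        nonlocal states
        for _ in range(steps):
            states = step(states, patterns, g, stimulus)

    def cue(module, mu):
        return {module: [stim_strength * x for x in patterns[module][mu]]}

    apply(cue("A", 0), stim_steps)   # stimulate a single input module
    apply({}, delay_steps)           # delay period without any stimulus
    after_first = {X: overlap(states[X], patterns[X][0]) for X in MODULES}

    apply(cue("B", 1), stim_steps)   # contradictory stimulus to the other input
    apply({}, delay_steps)
    after_second = {
        "A": overlap(states["A"], patterns["A"][0]),
        "B": overlap(states["B"], patterns["B"][1]),
        "C": overlap(states["C"], patterns["C"][0]),
    }
    return after_first, after_second


if __name__ == "__main__":
    first, second = run()
    print("overlap with feature 0 after cueing A:", first)
    print("overlaps after cueing B with feature 1:", second)
```

In this sketch, `after_first` reports each module's overlap with feature 0 after the delay following the stimulus to A alone, and `after_second` reports the overlap of A and C with feature 0 and of B with feature 1 after the second delay. High values in both indicate, within the toy model, that recall spreads from a single stimulated module and that B can afterwards hold a different feature from A and C.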