The present study contrasted the neural correlates of encoding item-context associations according to whether the contextual information was visual or auditory. Subjects (N = 20) underwent fMRI scanning while studying a series of visually presented pictures, each of which co-occurred with either a visually or an auditorily presented name. The task requirement was to judge whether the name corresponded to the presented object. In a subsequent memory test, subjects judged whether test pictures were studied or unstudied and, for items judged as studied, indicated the presentation modality of the associated name. Dissociable cortical regions demonstrating increased activity for visual vs. auditory trials (and vice versa) were identified. A subset of these modality-selective regions also showed modality-selective subsequent source memory effects, that is, enhanced responses on trials associated with correct modality judgments relative to those for which modality or item memory later failed. These findings constitute direct evidence for the proposal that successful encoding of a contextual feature is associated with enhanced activity in the cortical regions engaged during the on-line processing of that feature. In addition, successful encoding of visual objects within auditory contexts was associated with more extensive engagement of the hippocampus and adjacent medial temporal cortex than was the encoding of such objects within visual contexts. This raises the possibility that the encoding of across-modality item-context associations places greater demands on the hippocampus than does the encoding of within-modality associations.