The aim of this study was to demonstrate that the cross-modal priming effect is perceptual in nature and therefore consistent with the idea that knowledge is modality dependent. We conducted two experiments that tested cross-modal priming in both directions, each built on a two-phase (study and test) priming paradigm. In the study phase of Experiment 1, participants had to categorize auditory primes as "animal" or "artifact". In the test phase, they had to perform the same categorization task with visual targets which corresponded either to the auditory primes presented in the study phase (old items) or to new stimuli (new items). To demonstrate the perceptual nature of the cross-modal priming effect, half of the auditory primes were presented with a visual mask (old-masked items). In Experiment 2, the visual stimuli were used as primes and the auditory stimuli as targets, and half of the visual primes were presented with an auditory mask (white noise). We hypothesized that if the cross-modal priming effect results from an activation of modality-specific representations, then the mask should interfere with the priming effect. In both experiments, the results corroborated our predictions. In addition, we observed, for the first time, a cross-modal priming effect from pictures to sounds in a long-term paradigm.