A parallel neural network has been proposed for processing various types of information conveyed by faces including emotion. Using functional magnetic resonance imaging (fMRI), we tested the effect of the explicit attention to the emotional expression of the faces on the neuronal activity of the face-responsive regions. Delayed match to sample procedure was adopted. Subjects were required to match the visually presented pictures with regard to the contour of the face pictures, facial identity, and emotional expressions by valence (happy and fearful expressions) and arousal (fearful and sad expressions). Contour matching of the non-face scrambled pictures was used as a control condition. The face-responsive regions that responded more to faces than to non-face stimuli were the bilateral lateral fusiform gyrus (LFG), the right superior temporal sulcus (STS), and the bilateral intraparietal sulcus (IPS). In these regions, general attention to the face enhanced the activities of the bilateral LFG, the right STS, and the left IPS compared with attention to the contour of the facial image. Selective attention to facial emotion specifically enhanced the activity of the right STS compared with attention to the face per se. The results suggest that the right STS region plays a special role in facial emotion recognition within distributed face-processing systems. This finding may support the notion that the STS is involved in social perception.