Most EEG studies analysing speech production with event-related brain potentials (ERPs) have adopted silent metalinguistic tasks or delayed or tacit picture naming in order to avoid possible artefacts during motor preparation. A central issue in the interpretation of these results is whether the processes involved in those tasks are comparable to those involved in overt speech production. In the present study we addressed a methodological issue concerning the integration of stimulus-aligned and response-aligned ERPs in immediate overt picture naming in comparison to delayed production, coupled with a theoretical point on the effect of word Age of Acquisition (AoA). High-density EEG recordings were used, and waveform analyses and spatio-temporal segmentation were combined on stimulus-aligned and response-aligned ERPs. The same sequence and duration of topographic maps appeared in immediate and delayed production until around 350 ms after picture onset, revealing similar encoding processes until the beginning of phonological encoding, but modulations linked to word AoA were observed only in immediate production. Considering stimulus-aligned and response-aligned ERPs together made it possible to identify a stable topography starting around 350 ms that lasts 30 ms longer for late-acquired than for early-acquired words. This difference falls within the time window of phonological encoding, and its modulation can be linked to the longer production latencies for late-acquired words.