Synchronising movements with events in the surrounding environment is a ubiquitous aspect of everyday behaviour. Often, information about a stream of events is available across sensory modalities. While it is clear that we synchronise more accurately to auditory cues than to cues in other modalities, little is known about how the brain combines multisensory signals to produce accurately timed actions. Here, we investigate multisensory integration for sensorimotor synchronisation. We extend the prevailing linear phase correction model for movement synchronisation, describing asynchrony variance in terms of sensory, motor and timekeeper components. We then assess multisensory cue integration, deriving predictions based on the optimal combination of event-time estimates defined across different sensory modalities. Participants tapped in time with metronomes presented via auditory, visual and tactile modalities, under either unimodal or bimodal presentation conditions. Temporal regularity was manipulated between modalities by applying jitter to one of the metronomes. Results matched the model predictions closely in all but the high-jitter conditions for audio-visual and audio-tactile combinations, where a bias towards auditory signals was observed. We suggest that, in the production of repetitive timed actions, cues are integrated optimally with respect to both the sensory and the temporal reliability of events. However, when the temporal discrepancy between cues is high, they are treated independently, with movements timed to the cue with the highest sensory reliability.
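As a minimal sketch of the two frameworks referred to above, assuming their standard formulations rather than the specific extensions developed in this paper: in the linear phase correction model, the asynchrony $A_n$ between tap $n$ and the corresponding metronome onset evolves according to a first-order correction rule, and under maximum-likelihood cue combination two unimodal event-time estimates are averaged with weights inversely proportional to their variances. The notation here ($\alpha$, $T_n$, $M_n$, $C$, $\hat{t}_A$, $\hat{t}_V$, $\sigma_A^2$, $\sigma_V^2$) is illustrative and may differ from that used in the paper.
\[
A_{n+1} = (1 - \alpha)\,A_n + T_n + M_{n+1} - M_n - C,
\]
\[
\hat{t}_{AV} = \frac{\sigma_V^{2}}{\sigma_A^{2} + \sigma_V^{2}}\,\hat{t}_A + \frac{\sigma_A^{2}}{\sigma_A^{2} + \sigma_V^{2}}\,\hat{t}_V,
\qquad
\sigma_{AV}^{2} = \frac{\sigma_A^{2}\,\sigma_V^{2}}{\sigma_A^{2} + \sigma_V^{2}},
\]
where $\alpha$ is the phase-correction gain, $T_n$ the timekeeper interval, $M_n$ the motor delay, $C$ the metronome interval, and $\hat{t}_A$, $\hat{t}_V$ the auditory and visual event-time estimates with variances $\sigma_A^2$, $\sigma_V^2$. The combined estimate $\hat{t}_{AV}$ is never less reliable than either unimodal estimate, which is the basis of the optimal-integration predictions tested here.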