Functional magnetic resonance imaging (fMRI) can be used to detect experimental effects on brain activity across repeated measurements. The success of such studies depends on the size of the experimental effect, the reliability of the measurements, and the number of subjects. Here, we report on the stability of fMRI measurements and provide sample size estimates for repeated-measurement studies. Stability was quantified as the within-subject standard deviation (σ_w) of BOLD signal changes across measurements. Unlike correlation measures of stability, this statistic does not depend on the between-subjects variance in the sampled group. Sample sizes required for repeated measurements of the same subjects were calculated from σ_w. Ten healthy subjects performed a motor task while being scanned on three occasions, separated by one week. To exclude training effects on fMRI stability, all subjects were trained extensively on the task. Task performance, spatial activation pattern, and group-wise BOLD signal changes were highly stable over sessions. In contrast, individual activation levels fluctuated substantially (by up to half the group mean activation level), both in ROIs and in voxels. Because the amount of within-subject variation plays a crucial role in determining the success of an fMRI study with repeated measurements, and given this large degree of instability over sessions, improving stability is essential. To guide future studies, sample sizes are provided for a range of experimental effects and levels of stability. Obtaining estimates of these two variables is essential for selecting an appropriate number of subjects.
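To illustrate how a within-subject standard deviation can feed into a sample size calculation, the following is a minimal sketch and not necessarily the procedure used in this study. It assumes a balanced design (every subject measured the same number of times), a two-sided paired comparison between two sessions, and a normal approximation, which tends to underestimate the required number of subjects slightly when samples are small. The function names `within_subject_sd` and `sample_size_paired`, the default significance level and power, and the variable name `sigma_w` are hypothetical choices made for this example.

```python
import math
from statistics import NormalDist, variance


def within_subject_sd(measurements):
    """Estimate the within-subject SD from repeated measurements.

    `measurements` is a list with one inner list per subject, holding that
    subject's values (e.g. % BOLD signal change) for each session.
    Assumes a balanced design with at least two sessions per subject.
    """
    # Pool the per-subject sample variances; with equal session counts this
    # equals the within-subject mean square of a one-way ANOVA.
    per_subject_var = [variance(sessions) for sessions in measurements]
    return math.sqrt(sum(per_subject_var) / len(per_subject_var))


def sample_size_paired(effect, sigma_w, alpha=0.05, power=0.80):
    """Approximate number of subjects needed to detect a change of size
    `effect` between two measurements of the same subjects.

    Assumption: normal approximation to the paired t-test, two-sided alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # The difference of two measurements on the same subject has
    # SD = sqrt(2) * sigma_w when measurement errors are independent.
    sd_diff = math.sqrt(2) * sigma_w
    n = ((z_alpha + z_beta) * sd_diff / effect) ** 2
    return math.ceil(n)
```

Under these assumptions, the required number of subjects scales with the square of the ratio of the within-subject standard deviation to the effect size, so halving `sigma_w` cuts the required sample to roughly a quarter. This is why both an effect size estimate and a stability estimate are needed before the number of subjects can be chosen.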