In vivo MRI-derived measurements of human cerebral cortex thickness are providing novel insights into normal and abnormal neuroanatomy, but little is known about their reliability. We investigated how the reliability of cortical thickness measurements is affected by MRI instrument-related factors, including scanner field strength, manufacturer, upgrade and pulse sequence. Several data processing factors were also studied. Two test-retest data sets were analyzed: 1) 15 healthy older subjects scanned four times at 2-week intervals on three scanners; 2) 5 subjects scanned before and after a major scanner upgrade. Within-scanner variability of global cortical thickness measurements was <0.03 mm, and the point-wise standard deviation of measurement error was approximately 0.12 mm. On average, variability was 0.15 mm and 0.17 mm for cross-scanner (Siemens/GE) and cross-field-strength (1.5 T/3 T) comparisons, respectively. The scanner upgrade neither increased variability nor introduced bias. Measurements across field strengths, however, were slightly biased (thicker at 3 T). The number of acquisitions (single vs. multiple averaged) had a negligible effect on reliability, but the use of a different pulse sequence had a larger impact, as did the use of different data processing parameters. Sample size estimates indicate that a regional cortical thickness difference of 0.2 mm between two groups could be identified with as few as 7 subjects per group, and a difference of 0.1 mm could be detected with 26 subjects per group. These results demonstrate that MRI-derived cortical thickness measures are highly reliable when MRI instrument and data processing factors are controlled, but that it is important to consider these factors in the design of multi-site or longitudinal studies, such as clinical drug trials.
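As a rough illustration of how sample size estimates of this kind can be obtained, the sketch below applies the standard normal-approximation formula for a two-sided, two-sample comparison, n = 2 (z_{1-α/2} + z_{power})² σ² / δ², where δ is the thickness difference to detect and σ is the standard deviation of the measure. This is not necessarily the calculation used in the study. The function name `subjects_per_group`, the significance level, the power, and the standard deviation passed in are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def subjects_per_group(delta_mm, sd_mm, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-sample comparison.

    Uses the normal approximation
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2
    All default values are assumptions, not parameters taken from the study.
    """
    if delta_mm <= 0 or sd_mm <= 0:
        raise ValueError("delta_mm and sd_mm must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = 2 * (z_alpha + z_power) ** 2 * (sd_mm / delta_mm) ** 2
    return math.ceil(n)  # round up to whole subjects


if __name__ == "__main__":
    # Hypothetical standard deviation in mm, used only as a placeholder.
    hypothetical_sd_mm = 0.12
    for delta_mm in (0.2, 0.1):
        n = subjects_per_group(delta_mm, hypothetical_sd_mm)
        print(f"difference {delta_mm} mm: about {n} subjects per group")
```

Because the required n scales with 1/δ², halving the detectable difference roughly quadruples the number of subjects, which matches the pattern of the reported estimates (7 versus 26 per group). With the placeholder inputs used here, the output need not match those figures, since the standard deviation, significance level and power used in the study are not stated in this abstract.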