MRI is the paraclinical test most widely used to support the diagnosis of multiple sclerosis (MS). We evaluated interobserver agreement in applying diagnostic criteria to MRI obtained at first presentation. Five experienced observers scored 25 sets of images, each consisting of unenhanced T2-weighted and gadolinium-enhanced T1-weighted images (approximately half the sets were normal). On T2-weighted images we scored frontal, parietal, temporal, occipital, infratentorial and basal ganglia lesions and the total number of lesions; periventricular, callosal, juxtacortical and ovoid lesions; and lesions > 5 mm in maximum diameter. We also scored contrast-enhancing and hypointense lesions. Based on a combination of imaging findings, patients were classified as compatible or not compatible with MS according to composite criteria. Observer concordance was characterised by weighted kappa (κ) values and mean average difference to the median (MADM) scores. Using the raw scores, agreement was poor for the total number of lesions on T2-weighted images and for occipital, ovoid, juxtacortical and hypointense lesions. Agreement was moderate for frontal, callosal, basal ganglia and large (> 5 mm) lesions on T2-weighted images, and good for parietal, temporal, infratentorial and periventricular lesions. After dichotomisation according to accepted cut-off values, most criteria performed better, especially the number of lesions on T2-weighted images (P < 0.05). Agreement was good for the criteria of Paty and of Fazekas and moderate for those of Barkhof. While experienced observers may not agree on the total number of lesions, they show quite good agreement for commonly used cut-off points and for elements of the composite criteria. This validates the use of MRI in the diagnosis of MS, and the use of dichotomised and composite criteria.
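To make the two concordance measures concrete, the following is a minimal Python sketch and not the analysis code used in the study. It assumes linear disagreement weights for the weighted kappa, compares only two observers at a time (how the five observers were combined is not stated here), and reads MADM as each observer's mean absolute difference from the per-case median score. The function names, the category coding and the cut-off passed to `dichotomise` are all hypothetical.

```python
from statistics import median


def weighted_kappa(a, b, n_categories):
    """Cohen's weighted kappa for two observers (assumed: linear weights).

    a, b: ordinal category codes 0..n_categories-1, one per image set.
    """
    n = len(a)
    k = n_categories
    # Observed joint proportions of (observer A category, observer B category).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    # Weighted observed and chance-expected disagreement.
    observed = sum(abs(i - j) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k))
    if expected == 0:
        return float("nan")  # undefined when both observers use one category only
    return 1.0 - observed / expected


def madm(scores):
    """Per-observer mean absolute difference to the per-case median (assumed reading).

    scores: list of cases, each a list with one score per observer.
    """
    n_observers = len(scores[0])
    totals = [0.0] * n_observers
    for case in scores:
        m = median(case)
        for o, s in enumerate(case):
            totals[o] += abs(s - m)
    return [t / len(scores) for t in totals]


def dichotomise(counts, cutoff):
    """Map raw lesion counts to 0/1 using a cut-off (the value is hypothetical)."""
    return [1 if c >= cutoff else 0 for c in counts]
```

Dichotomising raw counts before computing kappa discards disagreements that fall on the same side of the cut-off, which is one way to read the reported improvement in agreement for the number of T2 lesions.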