Six histopathologists allocated 100 sections from patients with long-standing ulcerative colitis into four diagnostic categories: regular hyperplasia, reactive atypia, low-grade dysplasia and high-grade dysplasia. Their allocations were analysed using kappa statistics, including Fleiss's multiple kappa for groups of observers, and agreement on specific diagnoses was explored by constructing a conditional probability matrix. The nature of their disagreements was investigated using coefficients for systematic and haphazard errors. Over the four diagnostic categories there was a wide range of pairwise agreement, from 49% to 72%, and kappa values were only 'fair' or 'moderate'. As expected, agreement over the two categories 'dysplasia' vs 'no dysplasia' was better, ranging from 68% to 84%, and for 'atypia present' (reactive atypia, low- and high-grade dysplasia) vs 'no atypia' two pairings achieved over 90% and 11 pairings over 80% agreement. In view of its clinical importance, conditional agreement on high-grade dysplasia was examined separately: pairwise agreement on this diagnosis ranged from 100% down to as low as 33%. However, most of these disagreements fell into the low-grade dysplasia category, so that closer follow-up and further biopsies would still have been indicated. It is a truism that the basis for safe management is careful co-operation between clinicians and pathologists who have all the relevant facts and who know and trust one another's judgement. Thus, several aspects of the ideal diagnostic process cannot be evaluated in inter-observer studies, and the element of artificiality should be borne in mind when applying the findings to diagnostic practice. Nevertheless, the low level of agreement on the diagnosis of high-grade dysplasia achieved by certain pairings of specialist pathologists is a disturbing outcome of this study.
Inaccuracies should be minimized by a consensus approach and we therefore recommend referral of putative cases of dysplasia to interested pathologists for further opinions. We would also suggest that pathologists faced with appearances which are indefinite between reactive atypia and dysplasia would do better to describe them in terms of 'atypia, significance uncertain', so that closer surveillance is undertaken, rather than force them into more precise diagnostic categories which may be incorrect.
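
For readers unfamiliar with the agreement measures named above, the following is a minimal sketch of how raw pairwise agreement, Cohen's kappa for a pair of observers, Fleiss's kappa for a group of observers, and a simple conditional agreement on one diagnosis might be computed. It uses the standard textbook definitions and is not taken from the study itself: the function names, the category labels and the toy data are all hypothetical, and the conditional agreement shown here (the proportion of sections given a diagnosis by one observer that a second observer also gave that diagnosis) is only one plausible reading of an entry in a conditional probability matrix.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of sections on which two observers gave the same diagnosis."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two observers (chance-corrected agreement).

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance from each observer's marginal frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings, categories):
    """Fleiss's kappa for a group of observers.

    `ratings` is a list with one entry per section; each entry lists the
    diagnoses given by every observer for that section.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(r) for r in ratings]
    # Overall proportion of all diagnoses falling in each category.
    p_j = [sum(cnt[c] for cnt in counts) / (n_items * n_raters) for c in categories]
    # Per-section agreement among all pairs of observers.
    p_i = [
        (sum(cnt[c] ** 2 for c in categories) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_i) / n_items
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def conditional_agreement(a, b, category):
    """Of the sections observer `a` placed in `category`, the share that `b` also did."""
    chosen = [y for x, y in zip(a, b) if x == category]
    if not chosen:
        return None  # observer `a` never used this category
    return sum(y == category for y in chosen) / len(chosen)


if __name__ == "__main__":
    # Hypothetical toy data: two observers, six sections.
    # HYP = regular hyperplasia, RA = reactive atypia,
    # LGD = low-grade dysplasia, HGD = high-grade dysplasia.
    obs_1 = ["HYP", "RA", "LGD", "HGD", "HGD", "LGD"]
    obs_2 = ["HYP", "RA", "LGD", "HGD", "LGD", "RA"]
    print("raw agreement:", percent_agreement(obs_1, obs_2))
    print("Cohen's kappa:", cohen_kappa(obs_1, obs_2))
    print("agreement on HGD given obs_1:", conditional_agreement(obs_1, obs_2, "HGD"))
```

Collapsing the four categories into two (for example 'dysplasia' vs 'no dysplasia') before calling these functions would give the coarser comparisons reported above; with real data, every observer pairing would be evaluated in turn.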