This study describes the extent of agreement in classifying chest radiographs with the International Labor Organization (ILO) classification among six readers from the United States and Canada. A set of 119 radiographs was assembled and read by three Canadian and three US readers. The two ratings of interest were profusion (scored from 0/- to 3/+) and pleural abnormalities consistent with pneumoconiosis (scored with the ILO system, then collapsed into a yes/no variable). We used several approaches to evaluate interreader agreement on profusion and pleural changes: concordance, observed agreement, the kappa statistic, and a new measure approximating sensitivity and specificity. By the kappa statistic, five of six readers had fair to good agreement for pleural findings and for profusion as a dichotomous variable (≥1/0 vs ≤0/1), while the sixth reader had poor agreement. Concordance, expressed as percent agreement, was higher for normal radiographs than for those showing disease; we describe the use of the kappa statistic to control for this finding. This analysis adds to the existing literature through its use of the kappa statistic and by presenting a new measure of "underreading" and "overreading" tendencies.
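The kappa statistic referenced above corrects observed percent agreement for the agreement expected by chance, which is why it is less inflated than raw concordance when most radiographs are normal. A minimal sketch of Cohen's kappa for two readers' dichotomized profusion calls follows; the ratings shown are hypothetical and are not the study's data.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on paired ratings."""
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    # Observed agreement: fraction of radiographs where the readers match.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement, from each reader's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical dichotomized calls (1 = profusion >= 1/0, 0 = <= 0/1):
reader_1 = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
reader_2 = [0, 0, 1, 1, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(reader_1, reader_2), 3))  # → 0.583
```

Here the two readers agree on 8 of 10 films (observed agreement 0.80), but because both call most films normal, chance agreement is already 0.52, leaving a kappa of about 0.58 ("fair to good" on the conventional Landis-Koch scale).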