Observer variability in the pulmonary examination was assessed by having four blindfolded observers (two medical students and two pulmonary physicians) each examine 31 patients with abnormal pulmonary findings twice. Examiners were self-consistent in detecting pulmonary abnormalities in 74-89% of the examinations; conversely, they disagreed with themselves 11-26% of the time. Although pulmonary specialists recorded abnormal findings less often (55% of observations) than did medical students (74%), they were significantly (p = 0.008) less self-consistent than the students. Agreement between examiners showed no clear trend (kappa = 0.20-0.49). Each examiner's findings were also compared with those of physicians specially trained in pulmonary examination. Dichotomous variables (wheezes, crackles, rubs) were detected more reliably (kappa = 0.30-0.70) than graded variables (tympany, dullness, breath sound intensity; kappa = 0.16-0.43). The authors suggest that dichotomous variables deserve the greatest clinical reliance; that time in training alone does not improve clinical performance; and that there is a disconcertingly large amount of inter- and intraobserver disagreement in this fundamental clinical task.
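
For readers unfamiliar with the kappa statistic reported above, the following is a minimal sketch of how chance-corrected agreement between two sets of ratings can be computed. It assumes Cohen's kappa for two raters; the abstract does not state which kappa variant the authors used. The function name `cohen_kappa` and the ten example ratings are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings.

    Assumption: two raters (or one rater on two occasions) judged the same
    patients; this is an illustrative sketch, not the study's own method.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: fraction of patients given the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over all categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return math.nan

    return (observed - expected) / (1.0 - expected)


# Hypothetical example: 1 = crackles heard, 0 = not heard, for 10 patients.
examiner_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
examiner_2 = [1, 0, 1, 0, 1, 1, 0, 1, 0, 0]

# Observed agreement is 0.70 and chance agreement is 0.50 for these ratings.
print(f"kappa = {cohen_kappa(examiner_1, examiner_2):.2f}")
```

A kappa of 0 indicates agreement no better than chance and 1 indicates perfect agreement. In this hypothetical example the two examiners agree on 7 of 10 patients, yet kappa is only about 0.40 once chance agreement is removed, which shows why raw percent agreement and kappa can give different impressions of reliability.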