The interobserver variation among three experienced endoscopists in the endoscopic diagnosis and grading of reflux esophagitis was investigated in 150 dyspeptic patients. The variation was analyzed with kappa statistics to correct for the extent of agreement expected by chance alone. The observers diagnosed esophagitis in 22.7%, 32.7%, and 35.3% of the patients, respectively (p < 0.0002). Kappa values for grade 1 esophagitis varied from 0.34 to 0.47, a level generally considered to signify poor agreement, and despite partial agreement on the diagnosis in the individual patient there was almost complete disagreement on the features used to characterize grade 1. Kappa values for diagnosing erosive esophagitis (grades 2-4) were 0.68-0.79. Considering all three observers and all grades of esophagitis (grades 0-4), the overall chance-corrected agreement was 0.55. Agreement was particularly poor in patients with low-grade esophagitis without reflux-like dyspepsia and in cases where the observers expressed uncertainty about the diagnosis. Because of the large chance-corrected interobserver variation, the endoscopic diagnosis of grade 1 esophagitis is not reliable and thus may be problematic as a selection criterion for clinical trials. Interobserver variation on the presence of erosive/ulcerative esophagitis is acceptable and comparable to the level reported for peptic ulcer.
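
As an illustration of how a kappa statistic corrects observed agreement for chance, the following minimal sketch computes Cohen's kappa for one pair of observers. The function name `cohen_kappa` and the ten ratings are hypothetical and are not data from this study. The abstract does not state which kappa variant was used for the overall three-observer figure, so the pairwise form shown here is an assumption.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two observers (Cohen's kappa)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)

    # Observed agreement: fraction of patients given the same rating by both.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each observer's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings for ten patients (1 = esophagitis, 0 = none).
observer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
observer_2 = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the two observers agree on 8 of 10 patients (0.80), but 0.52 agreement would be expected by chance from their marginal rates, so the chance-corrected value is considerably lower than the raw agreement.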