Objectives: Probe-based confocal laser endomicroscopy (pCLE) is an imaging technique that allows real-time in vivo histological assessment of Barrett's esophagus (BE). The objectives of this study were to create and test novel pCLE criteria for dysplastic BE (phase I), and to evaluate accuracy, interobserver variability, and learning curve in dysplasia prediction (phase II) using these criteria.
Methods: In phase I, a pCLE expert and a gastrointestinal pathologist formulated new BE criteria by consensus using 50 pCLE videos. These criteria were tested and refined in an independent set of 30 pCLE videos. In phase II, all assessors (three experts and three trainees) underwent a formal training session. Each assessor then interpreted 75 testing videos as dysplasia (high-grade dysplasia (HGD)/cancer) vs. no dysplasia and recorded their confidence in each interpretation. Interobserver agreement and accuracy (with 95% confidence intervals (CI)) were determined for BE histology prediction.
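As a rough illustration of the two summary statistics named above, the following sketch computes a 95% CI for accuracy and a multi-rater kappa. The abstract does not state which interval or which kappa variant was used; the Wilson score interval and Fleiss' kappa are assumptions here, and the function names and toy ratings are hypothetical, not study data.

```python
import math

def wilson_ci(correct, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, not named in the abstract)."""
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

def fleiss_kappa(counts):
    """Fleiss' kappa. counts[i][j] is the number of raters who placed video i
    in category j; every video must be rated by the same number of raters."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(n_cats)]
    # Observed pairwise agreement for each video.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects          # mean observed agreement
    p_e = sum(p * p for p in p_j)          # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical toy data: 4 videos, 6 raters, columns = [dysplasia, no dysplasia].
toy_counts = [[6, 0], [5, 1], [3, 3], [0, 6]]
toy_kappa = fleiss_kappa(toy_counts)
toy_low, toy_high = wilson_ci(50, 100)
```

With six raters and two categories, each row of `toy_counts` sums to six; a kappa of 1 would indicate perfect agreement and 0 agreement no better than chance.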
Results: Of multiple pCLE criteria tested (phase I), only those with ≥70% sensitivity or specificity were included in the final set: epithelial surface: saw-toothed; cells: enlarged; cells: pleomorphic; glands: not equidistant; glands: unequal in size and shape; goblet cells: not easily identified. Overall accuracy in diagnosing dysplasia was 81.5% (95% CI: 77.5-81), with no difference between experts and non-experts. Accuracy of prediction was significantly higher when endoscopists were "confident" about their diagnosis (98% (95-99) vs. 62% (54-70), P<0.001). Accuracy of dysplasia prediction for the first 30 videos was not different from that for the last 45 (93 vs. 81%, P=0.51). Overall agreement of the criteria was substantial, κ=0.61 (0.53-0.69), with no difference between experts and non-experts.
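The inclusion rule for the final criteria set (sensitivity or specificity of at least 70%) can be sketched as follows. This is a minimal illustration only: the feature names, the per-video observations, and the helper functions are hypothetical and do not reproduce the study data or the authors' actual procedure.

```python
def sensitivity_specificity(present, dysplastic):
    """Sensitivity and specificity of one criterion against histology.
    present[i]: criterion seen in video i; dysplastic[i]: histology shows dysplasia."""
    pairs = list(zip(present, dysplastic))
    tp = sum(1 for p, d in pairs if p and d)
    fn = sum(1 for p, d in pairs if not p and d)
    tn = sum(1 for p, d in pairs if not p and not d)
    fp = sum(1 for p, d in pairs if p and not d)
    return tp / (tp + fn), tn / (tn + fp)

def select_criteria(observations, dysplastic, threshold=0.70):
    """Keep criteria whose sensitivity OR specificity reaches the threshold."""
    kept = []
    for name, present in observations.items():
        sens, spec = sensitivity_specificity(present, dysplastic)
        if sens >= threshold or spec >= threshold:
            kept.append(name)
    return kept

# Hypothetical toy data: 5 dysplastic videos followed by 5 non-dysplastic videos.
dysplastic = [True] * 5 + [False] * 5
observations = {
    "feature_a": [True, True, True, True, False, False, False, False, True, True],
    "feature_b": [True, True, False, False, False, True, True, False, False, True],
}
selected = select_criteria(observations, dysplastic)
```

In this toy set, `feature_a` has a sensitivity of 0.8 and is kept, while `feature_b` falls below 0.70 on both measures and is dropped.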
Conclusions: We developed and validated new pCLE criteria for the prediction of HGD/cancer in BE patients. Using these criteria, overall accuracy in predicting dysplasia was high, with substantial interobserver agreement. After a structured teaching session, accuracy and agreement did not differ between experienced and non-experienced observers, suggesting a short learning curve.