Calibrated Feature Fusion: Enhancing Few-Shot Industrial Anomaly Detection via Cross-Stage Representation Alignment

Sensors (Basel). 2026 Mar 31;26(7):2164. doi: 10.3390/s26072164.

Abstract

Few-shot industrial anomaly detection has attracted growing attention because it does not require a large number of anomalous samples for training. Recent few-shot methods commonly fuse multi-stage features from frozen vision transformers for anomaly scoring. However, we find that such direct fusion suffers from cross-stage representation misalignment: shallow and deep features differ significantly in scale and semantic granularity, leading to inconsistent anomaly maps and degraded localization. To address this problem, we propose Calibrated Feature Fusion (CFF), a lightweight adapter that enhances feature fusion via cross-stage representation alignment. The CFF module can be integrated into existing state-of-the-art frameworks and operates effectively in few-shot settings. Experiments on MVTec AD and VisA show that CFF consistently improves the state-of-the-art method across 1-, 2-, and 4-shot settings, achieving gains of up to +1.6% AUROC and +4.1% AP in pixel-level segmentation. Notably, CFF enhances both precision and recall in the 4-shot setting. Ablation studies confirm that cross-stage alignment is key to stable multi-stage fusion.
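The scale mismatch described above can be illustrated with a toy example. The sketch below is not the CFF adapter, whose design the abstract does not specify; it only shows, under assumed conditions, how averaging per-stage nearest-neighbor distance maps lets the stage with larger feature magnitudes dominate, and how a simple per-stage standardization (a stand-in for alignment) brings the stages onto a comparable scale. All names (`nearest_distance`, `standardize`, `fuse`) and the synthetic features are hypothetical.

```python
import numpy as np


def nearest_distance(query, support):
    """Euclidean distance from each query patch to its nearest support patch.

    query: (N, D) patch features of the test image.
    support: (M, D) patch features of the few normal reference images.
    """
    diffs = query[:, None, :] - support[None, :, :]
    return np.linalg.norm(diffs, axis=-1).min(axis=1)


def standardize(query, support):
    """Rescale both sets using per-channel statistics of the support features."""
    mean = support.mean(axis=0, keepdims=True)
    std = support.std(axis=0, keepdims=True) + 1e-6
    return (query - mean) / std, (support - mean) / std


def fuse(stages, calibrate):
    """Average the per-stage anomaly maps, optionally standardizing each stage."""
    maps = []
    for query, support in stages:
        if calibrate:
            query, support = standardize(query, support)
        maps.append(nearest_distance(query, support))
    return np.mean(maps, axis=0)


# Synthetic stand-ins for two transformer stages (assumption: the shallow
# stage has feature magnitudes about 100x larger than the deep stage).
rng = np.random.default_rng(0)
n_patches, dim = 16, 8
shallow_support = rng.normal(size=(32, dim)) * 100.0
deep_support = rng.normal(size=(32, dim))
shallow_query = rng.normal(size=(n_patches, dim)) * 100.0
deep_query = rng.normal(size=(n_patches, dim))
deep_query[3] += 10.0  # an anomaly that is visible only in the deep stage

stages = [(shallow_query, shallow_support), (deep_query, deep_support)]
naive = fuse(stages, calibrate=False)      # dominated by the shallow stage
calibrated = fuse(stages, calibrate=True)  # stages contribute on a similar scale

print("naive argmax:", int(naive.argmax()))
print("calibrated argmax:", int(calibrated.argmax()))
```

In this toy setup the calibrated map peaks at the anomalous patch, whereas the naive map is driven mostly by ordinary variation in the large-magnitude shallow stage. A learned adapter such as the one the paper proposes would presumably do more than fixed standardization; the sketch only motivates why some form of cross-stage alignment is needed before fusion.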

Keywords: cross-domain adaptation; cross-stage alignment; feature fusion; few-shot learning; industrial anomaly detection; vision transformers.