Accurate delineation of crop growth stages under real-world field conditions remains a long-standing challenge in computational phenotyping, particularly for wheat, whose developmental phases are characterized by subtle, continuous morphological transitions and are obscured by environmental noise. In this study, we propose AMFR-Net, an Adaptive Multi-Scale Feature Refinement Network tailored for fine-grained wheat growth stage identification from ground-level RGB imagery. Unlike conventional architectures, which struggle with ambiguous inter-stage boundaries and rigid receptive structures, AMFR-Net uses a ResNet-101 backbone augmented by a novel Adaptive Multi-Scale Attention Fusion (AMSAF) module, comprising cross-scale interaction blocks and confidence-weighted feature aggregation, to hierarchically recalibrate spatial-semantic representations. This design enables the network to adaptively amplify phenologically salient cues while suppressing irrelevant context, supporting robust generalization under constrained annotation and deployment conditions. Evaluated on the expert-labeled CGIAR benchmark, AMFR-Net achieves state-of-the-art performance across all major metrics (Top-1 Accuracy: 89.10%; Macro-F1: 89.10%; AUC: 97.88%) and discriminates phenologically adjacent stages better than both lightweight and deep CNN baselines. Ablation studies validate the synergistic effect of multi-level attention and scale-aware refinement. The proposed framework offers a scalable, interpretable, and field-deployable solution for in-situ phenology monitoring, and lays a foundation for future integration of multimodal sensing, weak supervision, and cross-seasonal adaptation.
Keywords: AMFR-Net; deep visual recognition; edge deployment; field-based RGB imagery; growth stage classification; multi-scale attention; precision agriculture.
Copyright © 2026 Sun, Hou, Guo, Wang, Min, Zheng, Tian, Zhang, Zhang, Liu, Gao, An, Qi and Lv.