Medical image classification requires models that capture both fine-grained local patterns and global anatomical structures while remaining computationally efficient enough for clinical deployment. Although state-of-the-art models such as MedMamba use State-Space Models (SSMs) to balance accuracy and efficiency, their sequential operations limit parallelism and increase runtime. To overcome these limitations, we propose MedSpectralNet, a lightweight Convolutional Neural Network (CNN) architecture that approximates self-attention with linear complexity to efficiently extract multi-frequency features. The model introduces a dual-stream feature extractor that processes global and local information in parallel, and a ContextGate block that adaptively fuses multi-scale representations. Evaluated on six MedMNIST benchmark datasets (BloodMNIST, BreastMNIST, DermaMNIST, PneumoniaMNIST, OrganCMNIST, and OrganSMNIST), MedSpectralNet achieves an average accuracy of 93.7% on OrganCMNIST and 98.0% on BloodMNIST, a relative accuracy gain of 1% to 4.3% over larger transformer-based models. Importantly, it delivers this performance with only 8.5 million parameters, approximately 41% fewer than the 14.5 million required by MedMamba-T. MedSpectralNet also attains AUC values of up to 0.999 across multiple classes. Together, these results demonstrate state-of-the-art accuracy with substantially reduced computational cost and improved parallelization, making MedSpectralNet well suited to real-time and resource-constrained medical classification applications.
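The parameter comparison can be checked directly from the two counts quoted above. The following is a minimal sketch that uses only those two figures; the helper name `relative_reduction` and the constant names are illustrative assumptions, not part of the MedSpectralNet code.

```python
# Parameter counts in millions, as quoted in the abstract.
MEDSPECTRALNET_PARAMS_M = 8.5
MEDMAMBA_T_PARAMS_M = 14.5


def relative_reduction(smaller: float, larger: float) -> float:
    """Fraction by which `smaller` falls below `larger` (hypothetical helper)."""
    if larger <= 0:
        raise ValueError("larger must be positive")
    return (larger - smaller) / larger


# Share of parameters saved relative to MedMamba-T.
reduction = relative_reduction(MEDSPECTRALNET_PARAMS_M, MEDMAMBA_T_PARAMS_M)
# Size of MedSpectralNet expressed as a fraction of MedMamba-T.
ratio = MEDSPECTRALNET_PARAMS_M / MEDMAMBA_T_PARAMS_M

print(f"fewer parameters: {reduction:.1%}")
print(f"size relative to MedMamba-T: {ratio:.1%}")
```

The two quantities sum to one, so the "fewer parameters" figure and the "relative size" figure should not be confused when comparing model footprints.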
Copyright: © 2026 Afrin et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.