GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features with Spatial and Channel Attention

Sensors (Basel). 2021 Nov 11;21(22):7504. doi: 10.3390/s21227504.

Abstract

We propose GourmetNet, a single-pass, end-to-end trainable network for food segmentation that achieves state-of-the-art performance. Food segmentation is an important problem because it is the first step in nutrition monitoring and in food volume and calorie estimation. Our novel architecture incorporates both channel attention and spatial attention information in an expanded multi-scale feature representation using our advanced Waterfall Atrous Spatial Pooling module. GourmetNet refines the feature extraction process by merging features from multiple levels of the backbone through the two attention modules. The refined features are processed with the advanced multi-scale waterfall module, which combines the benefits of cascade filtering and pyramid representations without requiring a separate decoder or post-processing. Our experiments on two food datasets show that GourmetNet significantly outperforms current state-of-the-art methods.
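
As a rough illustration of what "combining cascade filtering and pyramid representations" can mean, the toy sketch below applies atrous (dilated) filters in a waterfall arrangement on a 1D signal: each branch filters the output of the previous branch (the cascade), and every branch output is kept as a separate scale (the pyramid). This is not the GourmetNet implementation. The function names `atrous_conv1d` and `waterfall`, the box kernel, and the dilation rates are assumptions chosen for the example; the abstract does not specify these details, and the actual module operates on 2D feature maps inside a trained network.

```python
def atrous_conv1d(signal, kernel, rate):
    """Dilated 1D cross-correlation with zero padding (output has the input length).

    Hypothetical helper for illustration; `rate` is the spacing between kernel taps.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * rate  # tap position, spread out by the dilation rate
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += weight * signal[j]
        out.append(acc)
    return out


def waterfall(signal, kernel, rates):
    """Toy waterfall: cascade the atrous filters and keep every intermediate output.

    Assumed structure for illustration only: branch n filters the output of
    branch n-1, and all branch outputs are returned as a multi-scale set.
    """
    branches = []
    current = signal
    for rate in rates:
        current = atrous_conv1d(current, kernel, rate)  # cascade step
        branches.append(current)                        # pyramid: keep this scale
    return branches


# A unit impulse makes the growing receptive field of each branch visible.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
scales = waterfall(impulse, kernel=[1, 1, 1], rates=[1, 2])
```

With the impulse input, the first branch responds over 3 positions and the second over 7, which shows how later branches in the cascade see a wider context while earlier, finer scales remain available.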

Keywords: channel attention; food segmentation; multi-scale features; semantic segmentation; spatial attention.

MeSH terms

  • Attention
  • Food
  • Image Processing, Computer-Assisted*
  • Neural Networks, Computer*