Food Volume Estimation Based on Deep Learning View Synthesis from a Single Depth Map

Frank P-W Lo; Yingnan Sun; Jianing Qiu; Benny Lo

doi:10.3390/nu10122005

Nutrients. 2018 Dec 18;10(12):2005. doi: 10.3390/nu10122005.

Authors

Frank P-W Lo¹, Yingnan Sun², Jianing Qiu³, Benny Lo⁴

Affiliations

¹ Hamlyn Centre, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, UK. po.lo15@imperial.ac.uk.
² Hamlyn Centre, Department of Computing, Imperial College London, London SW7 2AZ, UK. y.sun16@imperial.ac.uk.
³ Hamlyn Centre, Department of Computing, Imperial College London, London SW7 2AZ, UK. jianing.qiu17@imperial.ac.uk.
⁴ Hamlyn Centre, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, UK. benny.lo@imperial.ac.uk.

Abstract

An objective dietary assessment system can help users understand their dietary behavior and enable targeted interventions to address underlying health problems. Accurately quantifying dietary intake requires measuring the portion size or food volume. Previous research on volume estimation has mostly relied on model-based or stereo-based approaches, which either depend on manual intervention or require users to capture multiple frames from different viewing angles, a process that can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate their volume from a single depth image. A distinct neural network is designed to take a depth image from one viewing angle and predict the depth image that would be captured from the opposite viewing angle. The complete 3D point cloud is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database of depth images of food items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated from the synthesized 3D point cloud with the ground truth volume of the object items.
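
To illustrate why a depth image of the opposite side makes volume recoverable, the following is a minimal sketch and not the paper's method: it skips the point cloud completion and ICP steps entirely. It assumes, hypothetically, that the synthesized back surface has already been re-expressed as depth along the same camera rays as the front view, that depths are z-depths in metres from an ideal pinhole camera, and that a value of 0 means "no reading". Under those assumptions each pixel covers an area of z²/(fx·fy) at depth z, so the volume enclosed between the two surfaces for that pixel is (z_far³ − z_near³)/(3·fx·fy). The function name and parameters are illustrative only.

```python
from typing import Sequence


def volume_between_surfaces(
    front: Sequence[Sequence[float]],
    back: Sequence[Sequence[float]],
    fx: float,
    fy: float,
) -> float:
    """Integrate the volume enclosed between a front and a back depth map.

    Assumptions (not from the paper): both maps share one pinhole camera
    frame, depths are z-depths in metres, and 0 marks a missing reading.
    fx and fy are focal lengths in pixels.
    """
    total = 0.0
    for front_row, back_row in zip(front, back):
        for z_near, z_far in zip(front_row, back_row):
            # Skip pixels with no reading or with an inverted surface pair.
            if z_near <= 0 or z_far <= z_near:
                continue
            # Integral of z^2 / (fx * fy) from z_near to z_far.
            total += (z_far ** 3 - z_near ** 3) / (3.0 * fx * fy)
    return total


if __name__ == "__main__":
    # A 2x2 patch, 10 cm thick, seen from 1 m away; one pixel has no reading.
    front = [[1.0, 1.0], [1.0, 0.0]]
    back = [[1.1, 1.1], [1.1, 1.1]]
    print(volume_between_surfaces(front, back, fx=100.0, fy=100.0))
```

The result is in cubic metres. A real pipeline would first have to register the synthesized view to the captured one, which is the role the abstract assigns to point cloud completion and ICP.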

Keywords: 3D reconstruction; deep learning; dietary assessment; image rendering; mHealth; view synthesis; volume estimation.

MeSH terms

Algorithms*
Deep Learning*
Diet*
Energy Intake
Humans
Imaging, Three-Dimensional
Nutrition Assessment*
Portion Size*

Grants and funding

OPP1171395/Bill & Melinda Gates Foundation - Innovative Passive Dietary Monitoring Project