Integrating Vision-Language Models for Accelerated High-Throughput Nutrition Screening

Adv Sci (Weinh). 2024 Sep;11(34):e2403578. doi: 10.1002/advs.202403578. Epub 2024 Jul 8.

Abstract

Addressing the critical need for swift and precise nutritional profiling in healthcare and the food industry, this study pioneers the integration of vision-language models (VLMs) with chemical analysis techniques. A cutting-edge VLM is unveiled, utilizing the expansive UMDFood-90k database, to significantly improve the speed and accuracy of nutrient estimation. Demonstrating a macro-AUCROC of 0.921 for lipid quantification, the model exhibits less than 10% variance compared to traditional chemical analyses for over 82% of the analyzed food items. This approach not only accelerates nutritional screening by 36.9% when tested among students but also sets a new benchmark in the precision of nutritional data compilation. This research marks a substantial leap forward in food science, employing a blend of advanced computational models and chemical validation to offer a rapid, high-throughput solution for nutritional analysis.
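
The abstract reports a macro-averaged AUROC but does not describe how lipid content was discretized into classes or how the metric was computed. As a general illustration only, the sketch below shows what a macro-averaged one-vs-rest AUROC means: each class is scored against all others, and the per-class values are averaged with equal weight. The function names, the three-class setup, and the toy scores are assumptions made for this example and are not taken from the study.

```python
from typing import List, Sequence


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win
    (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auroc(y_true: Sequence[int], y_score: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of one-vs-rest AUROC over all classes."""
    n_classes = len(y_score[0])
    per_class: List[float] = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]   # class c vs. the rest
        scores = [row[c] for row in y_score]            # predicted score for class c
        per_class.append(binary_auroc(labels, scores))
    return sum(per_class) / n_classes


# Hypothetical toy data: three assumed lipid classes (0 = low, 1 = medium, 2 = high).
y_true = [0, 0, 1, 1, 2, 2]
y_score = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.5, 0.3],
]
print(macro_auroc(y_true, y_score))  # per-class values 1.0, 0.875, 0.9375 average to 0.9375
```

Because every class contributes equally to the mean, a macro average does not let frequent classes mask poor performance on rare ones, which is one common reason to report it for imbalanced food categories.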

Keywords: food analysis; high‐throughput screening; machine learning; precision nutrition; vision‐language model.

MeSH terms

  • Food Analysis / methods
  • High-Throughput Screening Assays* / methods
  • Humans
  • Nutrition Assessment