Algorithm-based mapping of products in a branded Canadian food and beverage database to their equivalents in Health Canada's Canadian Nutrient File

Front Nutr. 2023 Feb 16:9:1013516. doi: 10.3389/fnut.2022.1013516. eCollection 2022.


Introduction: There is increasing recognition of the value of linking food sales databases to national food composition tables for population nutrition research.

Objectives: Building on automated and manual database mapping approaches in the literature, we aimed to match 1,179 food products in the Canadian data subset of Euromonitor International's Passport Nutrition to their closest equivalents in Health Canada's Canadian Nutrient File (CNF).

Methods: Matching took place in two major steps. First, an algorithm combining thresholds of maximal nutrient difference (between Euromonitor and CNF foods) with fuzzy matching was run to suggest candidate matches. If a nutritionally appropriate match was available among the algorithm's suggestions, it was selected. Second, when the suggested set contained no nutritionally sound match, the Euromonitor product was instead manually matched to a CNF food or deemed unmatchable, with the unique addition of expert validation to maximize rigor in matching. Both steps were performed independently by at least two team members with dietetics expertise.
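
To make the first step concrete, the following is a minimal Python sketch of a candidate-suggestion routine that combines a nutrient-difference threshold with fuzzy name matching. It is not the authors' implementation: the abstract does not report the thresholds, nutrient set, or fuzzy-matching method, so the constants `MAX_REL_DIFF`, `MIN_NAME_SIM`, and `NUTRIENTS`, the field names, and the use of Python's standard-library `difflib.SequenceMatcher` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# All constants below are hypothetical; the abstract does not report them.
MAX_REL_DIFF = 0.20  # assumed maximal relative difference allowed per nutrient
MIN_NAME_SIM = 0.50  # assumed minimal fuzzy name similarity (0 to 1)
NUTRIENTS = ("energy_kcal", "fat_g", "sugar_g")  # illustrative nutrient fields


def name_similarity(a, b):
    """Fuzzy similarity of two food names, from 0 (disjoint) to 1 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def within_thresholds(product, candidate):
    """True if each nutrient differs by at most MAX_REL_DIFF of the product value."""
    for nutrient in NUTRIENTS:
        ref, val = product[nutrient], candidate[nutrient]
        if ref == 0:
            # A zero reference value only matches a zero candidate value.
            if val != 0:
                return False
        elif abs(val - ref) / ref > MAX_REL_DIFF:
            return False
    return True


def suggest_matches(product, cnf_foods, top_k=3):
    """Return up to top_k candidate CNF foods, best name match first.

    Returns None when energy data are missing or zero, mirroring the
    products that could not be run through the algorithm.
    """
    if not product.get("energy_kcal"):
        return None
    scored = []
    for food in cnf_foods:
        sim = name_similarity(product["name"], food["name"])
        if sim >= MIN_NAME_SIM and within_thresholds(product, food):
            scored.append((sim, food))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [food for _, food in scored[:top_k]]
```

In this sketch, an empty list corresponds to a product with no algorithm suggestions, which would go to the manual matching step; a non-empty list would still be reviewed by raters for nutritional appropriateness.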

Results: Of 1,111 Euromonitor products run through the algorithm, an accurate CNF match was offered for 65%; missing or zero-calorie data precluded the remaining 68 products from being run through the algorithm. Products with two or more algorithm-suggested CNF matches had higher match accuracy than those with one (71% vs. 50%). Overall, inter-rater agreement (reliability) rates were robust for matches chosen among algorithm options (51%) and even higher regarding whether manual selection would be required (71%); among manually selected CNF matches, reliability was 33%. Ultimately, 1,152 (98%) of the 1,179 Euromonitor products were matched to a CNF equivalent.
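
As an illustration of how an inter-rater agreement rate of this kind can be computed, the sketch below calculates simple percent agreement between two raters. The abstract does not state which agreement statistic was used, so treating reliability as raw percent agreement is an assumption, and the function name `percent_agreement` and the example labels are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters made the same choice.

    Each argument is a sequence of choices (e.g., selected CNF food codes),
    aligned so that position i refers to the same product for both raters.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must rate the same number of products")
    if not rater_a:
        raise ValueError("At least one rated product is required")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * agreements / len(rater_a)


# Hypothetical choices by two raters for four products.
rater_1 = ["cnf_A", "cnf_B", "cnf_C", "manual"]
rater_2 = ["cnf_A", "cnf_B", "cnf_X", "manual"]
print(percent_agreement(rater_1, rater_2))
```

The reported totals are internally consistent under the same arithmetic: 1,111 + 68 = 1,179 products, and 1,152 / 1,179 is approximately 97.7%, which rounds to the stated 98%.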

Conclusion: The matching process reported here successfully linked the products of a food sales database to their respective CNF matches for use in future nutritional epidemiological studies of branded foods sold in Canada. Our team's novel use of dietetics expertise aided match validation at both steps, supporting the rigor and quality of the resulting match selections.

Keywords: Canada; Canadian Nutrient File; database mapping; food composition tables (FCTs); food supply; fuzzy matching; nutritional surveillance and monitoring; public health nutrition.

Grants and funding

Authors acknowledge MJ’s funding from the Canadian Institutes of Health Research (CIHR #428028), Banting Foundation Discovery Grant (#2019-1406), and the Canada Research Chair program (#950–233168) for this study. SG received a Yale Fox International Fellowship that also supported this work.