As whole genome sequencing (WGS) is increasingly used by the food industry to characterize pathogen isolates, users are challenged by the variety of analysis approaches available, ranging from methods that require extensive bioinformatics expertise to commercial software packages. This study aimed to assess the impact of analysis pipelines (i.e., different hqSNP pipelines, a cg/wgMLST pipeline) and of reference genome selection on analysis results (i.e., hqSNP and allelic differences as well as tree topologies) and on the conclusions drawn. For these comparisons, whole genome sequences were obtained for 40 Listeria monocytogenes isolates collected over 18 years from a cold-smoked salmon facility and 2 other isolates obtained from different facilities as part of academic research activities; WGS data were analyzed with three hqSNP pipelines and two MLST pipelines. After initial clustering using a k-mer based approach, hqSNP pipelines were run using two types of reference genomes: (i) closely related closed genomes ("closed references") and (ii) high-quality de novo assemblies of the dataset isolates ("draft references"). All hqSNP pipelines identified similar hqSNP difference ranges among isolates in a given cluster; the choice of reference genome had minimal impact on the hqSNP differences identified between isolate pairs. Allelic differences obtained by wgMLST showed ranges similar to those of hqSNP differences among isolates in a given cluster; cgMLST consistently showed fewer differences than wgMLST. However, phylogenetic trees and dendrograms based on hqSNP and cg/wgMLST data did show some incongruences, typically linked to clades supported by low bootstrap values in the trees.
When a hqSNP cutoff was used to classify isolates as "related" or "unrelated," use of different pipelines yielded a considerable number of discordances; this finding supports the view that cutoff values provide a valuable starting point for an investigation, but that supporting and epidemiological evidence should be used to interpret WGS data. Overall, our data suggest that cgMLST-based data analyses provide appropriate subtype differentiation and can be used without the need for preliminary data analyses (e.g., k-mer based clustering) or external closed reference genomes, simplifying data analysis needs. hqSNP or wgMLST analyses can then be performed on the isolate clusters identified by cgMLST to increase the precision in determining the genomic similarity between isolates.
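The cutoff-based discordance described above can be illustrated with a minimal sketch. It assumes each pipeline's output has already been reduced to a table of pairwise hqSNP differences; the isolate names, distance values, cutoff of 20, and the helper names `classify_pairs` and `discordant_pairs` are all hypothetical and are not taken from the study.

```python
def normalize(pair):
    """Order the two isolate names so (a, b) and (b, a) map to the same key."""
    return tuple(sorted(pair))


def classify_pairs(distances, cutoff):
    """Call a pair 'related' if its hqSNP difference is at or below the cutoff."""
    return {
        normalize(pair): ("related" if diff <= cutoff else "unrelated")
        for pair, diff in distances.items()
    }


def discordant_pairs(dist_a, dist_b, cutoff):
    """Return isolate pairs that two pipelines classify differently."""
    calls_a = classify_pairs(dist_a, cutoff)
    calls_b = classify_pairs(dist_b, cutoff)
    shared = sorted(set(calls_a) & set(calls_b))
    return [pair for pair in shared if calls_a[pair] != calls_b[pair]]


# Hypothetical pairwise hqSNP differences from two pipelines (illustrative only).
pipeline_a = {("iso1", "iso2"): 3, ("iso1", "iso3"): 19, ("iso2", "iso3"): 22}
pipeline_b = {("iso1", "iso2"): 4, ("iso1", "iso3"): 23, ("iso2", "iso3"): 25}
CUTOFF = 20  # assumed threshold, not a value reported in the abstract

print(discordant_pairs(pipeline_a, pipeline_b, CUTOFF))
```

In this toy example the two pipelines differ by only a few hqSNPs for every pair, yet the pair whose distances straddle the cutoff receives opposite calls, which is the kind of discordance that motivates treating a cutoff as a starting point rather than a verdict.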
Keywords: CFSAN pipeline; Listeria monocytogenes (L. monocytogenes); Lyve-SET; core genome MLST (cgMLST); high quality single nucleotide polymorphism (hqSNP); smoked salmon; whole genome MLST (wgMLST); whole genome sequence (WGS).