While differences in human virulence have been reported across nontyphoidal Salmonella (NTS) serovars and associated subtypes, a rational and scalable approach to identify Salmonella subtypes with differential ability to cause human disease is not available. Here, we used NTS serovar Saintpaul (S. Saintpaul) as a model to determine if metadata and associated whole-genome sequence (WGS) data in the NCBI Pathogen Detection (PD) database can be used to identify (i) subtypes with differential likelihoods of causing human disease and (ii) genes and single nucleotide polymorphisms (SNPs) potentially responsible for such differences. S. Saintpaul SNP clusters (n = 211) were assigned to epidemiology types (epi-types) based on statistically significant over- or underrepresentation of human clinical isolates: human associated (HA; n = 29), non-human associated (NHA; n = 23), and other (n = 159). Comparative genomic analyses identified 384 and 619 genes overrepresented among isolates in the 5 HA and 4 NHA SNP clusters, respectively, that were most significantly associated with the respective isolation source. These genes included 5 HA-associated virulence genes previously reported to be present on Gifsy-1/Gifsy-2 prophages. Additionally, premature stop codons in 3 and 7 genes were overrepresented among the selected HA and NHA SNP clusters, respectively. Tissue culture experiments with strains representing 4 HA and 3 NHA SNP clusters did not reveal evidence of enhanced invasion or intracellular survival for HA strains. However, the presence of sodCI (encoding a superoxide dismutase), found in 4 HA and 1 NHA SNP clusters, was positively correlated with intracellular survival in macrophage-like cells. Post hoc analyses also suggested a possible difference in intracellular survival among S. Saintpaul lineages.
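The epi-type assignment described above can be illustrated with a minimal sketch. The abstract states only that SNP clusters were classified by statistically significant over- or underrepresentation of human clinical isolates; the specific test used here (a two-sided Fisher's exact test comparing one cluster against all remaining isolates), the 0.05 threshold, the absence of a multiple-testing correction, the function names, and the isolate counts are all assumptions made for illustration, not the authors' stated method or data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def assign_epi_type(human_in, total_in, human_all, total_all, alpha=0.05):
    """Label one SNP cluster as 'HA', 'NHA', or 'other' (hypothetical helper).

    human_in / total_in: human clinical and total isolates in the cluster.
    human_all / total_all: the same counts for the whole serovar data set.
    alpha is an assumed, uncorrected significance threshold.
    """
    a = human_in                      # human isolates inside the cluster
    b = total_in - human_in           # non-human isolates inside the cluster
    c = human_all - human_in          # human isolates outside the cluster
    d = (total_all - total_in) - c    # non-human isolates outside the cluster
    p = fisher_exact_two_sided(a, b, c, d)
    if p >= alpha:
        return "other"
    # Direction of the significant difference decides HA versus NHA
    return "HA" if a / total_in > c / (c + d) else "NHA"


if __name__ == "__main__":
    # Made-up counts: (human isolates in cluster, total isolates in cluster)
    for human_in, total_in in [(45, 50), (2, 50), (20, 50)]:
        print(human_in, total_in, assign_epi_type(human_in, total_in, 400, 1000))
```

A real analysis over 211 clusters would also need to control for multiple comparisons, which this sketch deliberately omits.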
IMPORTANCE Not all Salmonella isolates are equally likely to cause human disease, and Salmonella control strategies may unintentionally focus on serovars and subtypes that are highly prevalent in source populations but are rarely associated with human clinical illness. We describe a framework leveraging WGS data in the NCBI PD database to identify Salmonella subtypes over- and underrepresented among human clinical cases. While we identified genomic signatures associated with HA/NHA SNP clusters, tissue culture experiments failed to identify consistent phenotypic characteristics indicative of enhanced human virulence of HA strains. Our findings illustrate the challenges of defining hypo- and hypervirulent S. Saintpaul and the potential limitations of phenotypic assays when evaluating human virulence, for which in vivo experiments are essential. However, the identification of sodCI, an HA-associated virulence gene linked to enhanced intracellular survival, illustrates the potential of the framework and is consistent with prior work identifying specific genomic features responsible for enhanced or reduced virulence of nontyphoidal Salmonella.
Keywords: SNP clusters; comparative genomic analyses; human virulence; intracellular survival; invasion; nontyphoidal Salmonella; pathogen detection; phenotypic characterization; regulatory policy; serovar Saintpaul.