Insights into the population structure and pan-genome of Haemophilus influenzae

Infect Genet Evol. 2019 Jan:67:126-135. doi: 10.1016/j.meegid.2018.10.025. Epub 2018 Oct 31.

Abstract

The human-restricted bacterium Haemophilus influenzae is responsible for respiratory infections in both children and adults. While colonization begins in the upper airways, it can spread throughout the respiratory tract, potentially leading to invasive infections. Although the spread of H. influenzae serotype b (Hib) has been prevented by vaccination, the emergence of infections by other serotypes as well as by non-typeable isolates (NTHi) has been observed, prompting the need for novel prevention strategies. Here, we aimed to study the population structure of H. influenzae and to gain insights into its pan-genome. We studied 305 H. influenzae strains, comprising 217 publicly available genomes and 88 newly sequenced invasive strains isolated in Portugal over a 24-year period. NTHi isolates presented a core-SNP-based genetic diversity about 10-fold higher than that observed for Hib. The analysis of key factors involved in pathogenesis, such as lipooligosaccharides, hemagglutinating pili and High Molecular Weight-adhesins, suggests that NTHi shapes its virulence repertoire, either by acquisition and loss of genes or by SNP-based diversification, likely towards host immune evasion and persistence. Discrete NTHi subpopulation structures are proposed based on the core genome, supported by 17 candidate genetic markers identified in the accessory genome. Additionally, this study provides two bioinformatics tools for rapid in silico identification of H. influenzae serotypes and of previously proposed NTHi clades, obviating demanding laboratory-based procedures. The present study constitutes an important genomic framework that could pave the way for future studies on the genetic determinants underlying invasiveness and disease and on the population structure of H. influenzae.

Keywords: Haemophilus influenzae; Non-typeable; Pan-genome; Pathogenesis; Whole-genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Genetic Variation
  • Genome, Bacterial*
  • Genomics* / methods
  • Haemophilus Infections / microbiology*
  • Haemophilus influenzae / genetics*
  • Haemophilus influenzae / pathogenicity
  • Humans
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • Virulence / genetics
  • Whole Genome Sequencing