Animal studies of influenza virus mutations that alter host specificity and enable replication in mammalian hosts are necessarily limited in sample size by high cost and safety requirements. As a safe, higher-throughput alternative, we explore the use of readily available passage bias data, obtained mostly from seasonal H1 and H3 influenza strains that were differentially grown in mammalian cells (MDCK) and avian cells (eggs). Applying a statistical approach to over 80,000 influenza hemagglutinin sequences with passage information, we find that passage bias sites are most commonly located in three regions: (i) the globular head domain around the receptor binding site, (ii) the region that undergoes pH-dependent structural changes, and (iii) the unstructured N-terminal region harbouring the signal peptide. Passage bias sites are consistent among different passage cell types as well as between influenza A subtypes. We also find epistatic interactions between site pairs, supporting the notion of a host-specific dependency of mutations on the viral genomic background. The sites identified in our large-scale sequence analysis overlap substantially with known host adaptation sites in the WHO H5N1 genetic changes inventory, suggesting that passage bias information can provide candidate sites for host specificity changes to aid risk assessment of emerging strains.
Keywords: adaptation; hemagglutinin; host specificity; influenza; mutations; pandemic; passage.