Using multiple linear regression and physicochemical changes of amino acid mutations to predict antigenic variants of influenza A/H3N2 viruses

Biomed Mater Eng. 2014;24(6):3729-35. doi: 10.3233/BME-141201.

Abstract

Among human influenza viruses, strain A/H3N2 accounts for over a quarter of a million deaths annually. Antigenic variants of these viruses often render current vaccines ineffective and lead to repeated infections. In this study, a computational model was developed to predict antigenic variants of the A/H3N2 strain. First, 18 critical antigenic amino acid positions in the hemagglutinin (HA) protein were identified using a scoring method that combines the phi (ϕ) coefficient with information entropy. Next, a prediction model was developed by integrating the multiple linear regression method with eight types of physicochemical changes at the critical amino acid positions. Compared with three other known models, our prediction model achieved the best performance not only on the training dataset but also on the commonly used testing dataset composed of 31,878 antigenic relationships of the H3N2 influenza virus.
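The three building blocks named in the abstract (the phi coefficient, information entropy, and multiple linear regression) can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation: the abstract does not state how the phi and entropy scores are combined, which eight physicochemical properties are used, or what the regression response is. The function names, the toy data, and the choice of a numeric antigenic distance as the response are all assumptions made for illustration.

```python
import math
from collections import Counter


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length binary (0/1) sequences.

    Assumed use: x marks whether a position is mutated in a strain pair,
    y marks whether that pair is an antigenic variant.
    """
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom


def shannon_entropy(residues):
    """Shannon entropy (in bits) of the residues seen at one alignment column."""
    total = len(residues)
    counts = Counter(residues)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def fit_linear_model(rows, targets):
    """Ordinary least squares via the normal equations.

    rows: one feature vector per strain pair (hypothetical physicochemical
    change values). Returns [intercept, b1, b2, ...].
    """
    A = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(A[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in A) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(A, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


# Toy data only: two made-up feature columns and a made-up response.
features = [[0, 1], [1, 0], [1, 1], [2, 1], [0, 0]]
distances = [2.5, 1.5, 3.5, 4.5, 0.5]
coefficients = fit_linear_model(features, distances)
```

In a setting like the one described, a position would presumably score highly when its mutation status correlates with antigenic change (phi) and when it is variable across strains (entropy); the fitted coefficients would then weight each physicochemical change at the selected positions.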

Keywords: H3N2; Influenza A virus; antigenic variant; multiple linear regression; physicochemical properties.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Antigens, Viral / chemistry*
  • Antigens, Viral / genetics*
  • Base Sequence
  • Computer Simulation
  • DNA Mutational Analysis / methods*
  • Data Interpretation, Statistical
  • Influenza A Virus, H3N2 Subtype / genetics*
  • Linear Models
  • Molecular Sequence Data
  • Regression Analysis

Substances

  • Antigens, Viral