A statistical analysis of vaccine-adverse event data

BMC Med Inform Decis Mak. 2019 May 28;19(1):101. doi: 10.1186/s12911-019-0818-8.


Background: Vaccination has been one of the most successful public health interventions to date, and the U.S. FDA/CDC Vaccine Adverse Event Reporting System (VAERS) currently contains more than 500,000 reports of adverse events that occurred after the administration of vaccines licensed in the United States. The VAERS dataset is very large, contains nominal variables of very high dimension, and is complex because a single report can list multiple vaccines and multiple adverse symptoms. To date, no statistical analysis has attempted to identify the cross-board patterns in how all reported adverse symptoms relate to the vaccines.

Methods: To study the relationship between vaccines and reported adverse events, we consider a partial VAERS dataset that includes all reports filed over the 24 years from 1990 to 2013. We propose a neighboring method for processing this dataset that deals with the complications caused by the listing of multiple vaccines and adverse symptoms in a single report. We then combine the neighboring method with a novel use of data visualization techniques to analyze this high-dimensional dataset and characterize the cross-board patterns of the relations between all reported vaccines and events.
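
To illustrate the multiple-listing complication only, here is a minimal Python sketch that counts every vaccine-symptom pair co-listed in a report. It is not the neighboring method, which the abstract does not specify. The report layout, the field names `vaccines` and `symptoms`, the toy records, and the helper `pair_counts` are all hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical toy reports (not VAERS data): each report may list
# several vaccines and several symptoms at once.
reports = [
    {"vaccines": ["FLU"], "symptoms": ["Pyrexia", "Headache"]},
    {"vaccines": ["FLU", "HEPB"], "symptoms": ["Pyrexia"]},
    {"vaccines": ["DTAP"], "symptoms": ["Injection site erythema", "Pyrexia"]},
]


def pair_counts(reports):
    """Count each (vaccine, symptom) pair once per report that co-lists it.

    This is naive pair counting, used here only as an assumption-laden
    stand-in; it is not the paper's neighboring method.
    """
    counts = Counter()
    for report in reports:
        # set() guards against the same name appearing twice in one report.
        vaccines = set(report["vaccines"])
        symptoms = set(report["symptoms"])
        for vaccine, symptom in product(vaccines, symptoms):
            counts[(vaccine, symptom)] += 1
    return counts


counts = pair_counts(reports)
```

In this toy example, the second report credits "Pyrexia" to both FLU and HEPB. Naive counting cannot attribute a symptom to one of several co-administered vaccines, which is the kind of complication a dedicated processing method has to address.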

Results: Our analysis indicates that events or symptoms with high overall occurrence frequencies are positively correlated. The most frequently occurring adverse symptoms are mostly uncorrelated or negatively correlated under different bacteria vaccines, but are in many cases positively correlated under different virus vaccines, especially flu vaccines. No particular pattern appears for live vs. inactivated vaccines.
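
As a rough illustration of what a correlation coefficient matrix between symptoms can look like, the sketch below computes pairwise Pearson correlations between symptom frequency vectors indexed by vaccine. The abstract does not state how the authors construct their matrix, so the use of Pearson correlation, the invented counts in `freq`, and the helpers `pearson` and `correlation_matrix` are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


# Hypothetical counts (not VAERS data): one row per symptom,
# one column per vaccine, in a fixed vaccine order.
freq = {
    "Pyrexia": [10, 20, 30],
    "Headache": [1, 2, 3],
    "Rash": [3, 2, 1],
}


def correlation_matrix(freq):
    """Return a nested dict of pairwise symptom correlations."""
    names = list(freq)
    return {a: {b: pearson(freq[a], freq[b]) for b in names} for a in names}


matrix = correlation_matrix(freq)
```

With these invented counts, "Pyrexia" and "Headache" rise together across vaccines and come out positively correlated, while "Pyrexia" and "Rash" move in opposite directions and come out negatively correlated.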

Conclusions: This article identifies certain cross-board patterns in the relationship between vaccines and reported adverse events or symptoms. These patterns aid understanding of the VAERS data and provide a useful starting point for developing statistical models and procedures to analyze the VAERS data further.

Keywords: Bacteria vaccine; Correlation coefficient matrix; Data visualization; Inactivated vaccine; Live vaccine; Neighboring method; Virus vaccine.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adverse Drug Reaction Reporting Systems*
  • Datasets as Topic
  • Humans
  • United States
  • Vaccination / adverse effects*
  • Vaccination / statistics & numerical data*
  • Vaccines / adverse effects*


Substances

  • Vaccines