Big Data Usage Patterns in the Health Care Domain: A Use Case Driven Approach Applied to the Assessment of Vaccination Benefits and Risks. Contribution of the IMIA Primary Healthcare Working Group

Yearb Med Inform. 2014 Aug 15;9(1):27-35. doi: 10.15265/IY-2014-0016.


Background: The benefits and risks of vaccines can generally be determined from studies carried out as part of regulatory compliance, followed by surveillance of routine data; however, some rarer and longer-term events require new methods. Big data, generated by increasingly affordable personalised computing and by pervasive computing devices, is growing rapidly, and low-cost, high-volume cloud computing makes its processing inexpensive.

Objective: To describe how big data and related analytical methods might be applied to assess the benefits and risks of vaccines.

Method: We reviewed the literature on the use of big data to improve health and applied it to generic vaccine use cases that illustrate the benefits and risks of vaccination. We defined a use case as the interaction between a user and an information system to achieve a goal. We used flu vaccination and pre-school childhood immunisation as exemplars.

Results: We reviewed three big data use cases relevant to assessing vaccine benefits and risks: (i) big data processing using crowdsourcing, distributed big data processing, and predictive analytics; (ii) data integration from heterogeneous big data sources, e.g. the increasing range of devices in the "internet of things"; and (iii) real-time monitoring of epidemics and of vaccine effects via social media and other data sources.

Conclusions: Big data raises new ethical dilemmas, but its analysis methods can provide complementary real-time capabilities for monitoring epidemics and assessing the benefit-risk balance of vaccines.

Keywords: Population surveillance; immunization; information science; medical record systems, computerized; public health.

Publication types

  • Review

MeSH terms

  • Computational Biology*
  • Data Mining*
  • Databases, Factual*
  • Epidemics
  • Humans
  • Medical Informatics
  • Medical Records Systems, Computerized
  • Population Surveillance / methods*
  • Vaccination* / adverse effects
  • Vaccination* / statistics & numerical data