Fast Bayesian scan statistics for multivariate event detection and visualization

Stat Med. 2011 Feb 28;30(5):455-69. doi: 10.1002/sim.3881.

Abstract

The multivariate Bayesian scan statistic (MBSS) is a recently proposed, general framework for event detection and characterization in multivariate space-time data. MBSS integrates prior information and observations from multiple data streams in a Bayesian framework, computing the posterior probability of each type of event in each space-time region. MBSS has been shown to have many advantages over previous event detection approaches, including improved timeliness and accuracy of detection, easy interpretation and visualization of results, and the ability to model and accurately differentiate between multiple event types. This work extends the MBSS framework to enable detection and visualization of irregularly shaped clusters in multivariate data, by defining a hierarchical prior over all subsets of locations. While a naive search over the exponentially many subsets would be computationally infeasible, we demonstrate that the total posterior probability that each location has been affected can be efficiently computed, enabling rapid detection and visualization of irregular clusters. We compare the run time and detection power of this 'Fast Subset Sums' method to our original MBSS approach (assuming a uniform prior over circular regions) on semi-synthetic outbreaks injected into real-world Emergency Department data from Allegheny County, Pennsylvania. We demonstrate substantial improvements in spatial accuracy and timeliness of detection, while maintaining the scalability and fast run time of the original MBSS method.
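
The claim that a sum over exponentially many subsets can be computed efficiently can be illustrated with a small sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: each of the k locations in a neighborhood is included in the affected subset independently with prior probability p, and the likelihood ratio of a subset is the product of per-location likelihood ratios. Under those assumptions, the prior-weighted sum over all 2^k subsets factorizes into a product of k terms. The function names and the example values are hypothetical.

```python
from itertools import combinations
from math import isclose, prod


def subset_sum_naive(lrs, p=0.5):
    """Prior-weighted sum of subset likelihood ratios, enumerating all 2^k subsets.

    Assumption (not from the abstract): each location is included independently
    with probability p, and a subset's likelihood ratio is the product of the
    likelihood ratios of the locations it contains.
    """
    k = len(lrs)
    total = 0.0
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            chosen = set(subset)
            # Prior weight of this exact subset under independent inclusion.
            weight = prod(p if i in chosen else 1.0 - p for i in range(k))
            # Likelihood ratio of the subset (empty subset gives 1.0).
            total += weight * prod(lrs[i] for i in chosen)
    return total


def subset_sum_fast(lrs, p=0.5):
    """Same quantity in O(k): the sum over subsets factorizes per location."""
    return prod(p * lr + (1.0 - p) for lr in lrs)


def location_share(lrs, j, p=0.5):
    """Fraction of the subset sum contributed by subsets that contain location j.

    Under the same assumptions, all other locations cancel, leaving a
    per-location expression.
    """
    return p * lrs[j] / (p * lrs[j] + (1.0 - p))


if __name__ == "__main__":
    example_lrs = [2.0, 0.5, 3.0]  # hypothetical per-location likelihood ratios
    print(subset_sum_naive(example_lrs), subset_sum_fast(example_lrs))
    print([location_share(example_lrs, j) for j in range(len(example_lrs))])
```

The naive function costs O(2^k) and is included only to check the factorized form on small inputs. The per-location share is the kind of quantity that could be mapped for visualization, although the full method also involves priors over event types and neighborhoods that this sketch omits.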

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Biosurveillance / methods*
  • Computer Graphics*
  • Computer Simulation
  • Cough / epidemiology
  • Disease Outbreaks / statistics & numerical data
  • Dyspnea / epidemiology
  • Emergency Service, Hospital / statistics & numerical data
  • Humans
  • Likelihood Functions
  • Models, Statistical*
  • Nausea / epidemiology
  • Pennsylvania / epidemiology
  • Poisson Distribution
  • Probability
  • Space-Time Clustering
  • Vomiting / epidemiology