Methods for safety signal detection in healthcare databases: a literature review

Expert Opin Drug Saf. 2017 Jun;16(6):721-732. doi: 10.1080/14740338.2017.1325463. Epub 2017 May 15.


Introduction: With increasing availability, the use of healthcare databases as a complementary data source for drug safety signal detection has been explored to circumvent the limitations inherent in spontaneous reporting.

Areas covered: To review the methods proposed for safety signal detection in healthcare databases and their performance.

Expert opinion: Fifteen different data mining methods were identified. They are based on disproportionality analysis, traditional pharmacoepidemiological designs (e.g. self-controlled designs), sequence symmetry analysis (SSA), sequential statistical testing, temporal association rules, supervised machine learning (SML), and the tree-based scan statistic. In terms of performance, the self-controlled designs, SSA, and SML seemed the most interesting approaches. For routine signal detection from healthcare databases, pragmatic aspects must also be considered, such as the need for stakeholders to understand the method in order to be confident in the results. From this point of view, SSA may be the most suitable method for signal detection in healthcare databases owing to its simple principle and its ability to provide a risk estimate. However, further developments, such as automated prioritization, are needed to help stakeholders handle the multiplicity of signals.
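
The "simple principle" attributed to SSA above can be illustrated with a minimal sketch. The abstract does not describe an implementation, so everything below is an assumption made for illustration: the function name crude_sequence_ratio, the 365-day window, the use of a marker drug (or diagnosis) as a proxy for the adverse event, and the toy dates. The sketch computes only the crude sequence ratio, i.e. the number of patients who started the index drug before the marker divided by the number who started it after; it omits the adjustment for prescribing trends that SSA normally applies.

```python
from datetime import date


def crude_sequence_ratio(first_dates, window_days=365):
    """Crude sequence ratio for one drug-event pair (illustrative sketch).

    first_dates: iterable of (index_drug_start, marker_start) date pairs,
    one pair per patient who initiated both. The marker is a hypothetical
    proxy for the adverse event (e.g. first dispensing of a treating drug).
    Returns (index_first, marker_first, ratio).
    """
    index_first = 0   # index drug started before the marker
    marker_first = 0  # marker started before the index drug
    for index_start, marker_start in first_dates:
        gap = (marker_start - index_start).days
        # Same-day starts carry no ordering information; pairs further
        # apart than the window (an assumed 365 days) are excluded.
        if gap == 0 or abs(gap) > window_days:
            continue
        if gap > 0:
            index_first += 1
        else:
            marker_first += 1
    if marker_first == 0:
        raise ValueError("no marker-first sequences; ratio is undefined")
    return index_first, marker_first, index_first / marker_first


# Toy data, invented purely for illustration (not from the review).
patients = [
    (date(2015, 1, 10), date(2015, 3, 1)),    # index drug first
    (date(2015, 2, 1), date(2015, 6, 15)),    # index drug first
    (date(2015, 5, 20), date(2015, 4, 2)),    # marker first
    (date(2015, 7, 1), date(2015, 7, 1)),     # same day, skipped
    (date(2014, 1, 1), date(2016, 1, 1)),     # outside the window, skipped
    (date(2015, 9, 1), date(2015, 11, 30)),   # index drug first
]

print(crude_sequence_ratio(patients))
```

A ratio above 1 means the marker follows the index drug more often than it precedes it, which is the asymmetry SSA reads as a possible signal. Because each patient contributes only the order of two events, time-invariant patient characteristics cancel out, which is part of why the method is easy to explain to stakeholders.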

Keywords: Drug safety; data mining; pharmacoepidemiology; pharmacovigilance; signal detection.

Publication types

  • Comparative Study
  • Review

MeSH terms

  • Adverse Drug Reaction Reporting Systems / statistics & numerical data*
  • Databases, Factual / statistics & numerical data*
  • Drug-Related Side Effects and Adverse Reactions / epidemiology*
  • Epidemiologic Research Design
  • Humans
  • Machine Learning
  • Pharmacoepidemiology / methods