Pharmacovigilance on Twitter? Mining tweets for adverse drug reactions

AMIA Annu Symp Proc. 2014 Nov 14;2014:924-33. eCollection 2014.


Recent research has shown that Twitter data analytics can have broad implications for public health research. However, its value for pharmacovigilance has scarcely been studied, with health-related forums and community support groups preferred for the task. We present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). We created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen's kappa, we calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, we attempted a lexicon-based approach for concept extraction, with promising results (54.1% precision, 62.1% recall, and 57.8% F-measure). A subset of the corpus is freely available at:
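The two kinds of figures quoted in the abstract, Cohen's kappa for binary annotations and the F-measure derived from precision and recall, can be illustrated with a minimal sketch. The function names `cohen_kappa` and `f_measure` and the ten toy labels are assumptions made for illustration; they are not the authors' code or data. Only the precision and recall values (54.1% and 62.1%) come from the abstract.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(counts_a) | set(counts_b)
    )
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy annotations (1 = tweet mentions an ADR, 0 = it does not).
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
toy_kappa = cohen_kappa(annotator_a, annotator_b)

# Precision and recall as reported in the abstract.
reported_f = f_measure(0.541, 0.621)
print(f"toy kappa = {toy_kappa:.3f}, F-measure = {reported_f:.3f}")
```

With the reported precision and recall, the harmonic mean comes to about 0.578, which agrees with the 57.8% F-measure stated in the abstract. The kappa value printed here reflects only the toy labels and says nothing about the corpus itself.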

MeSH terms

  • Data Mining / methods*
  • Drug-Related Side Effects and Adverse Reactions*
  • Humans
  • Internet*
  • Pharmacovigilance*
  • Prescription Drugs / adverse effects


  • Prescription Drugs