Quantifying the Severity of Adverse Drug Reactions Using Social Media: Network Analysis

J Med Internet Res. 2021 Oct 21;23(10):e27714. doi: 10.2196/27714.


Background: Adverse drug reactions (ADRs) affect the health of hundreds of thousands of individuals annually in the United States, with associated costs of hundreds of billions of dollars. The monitoring and analysis of ADR severity are limited by the current qualitative and categorical systems of severity classification. Previous efforts have generated quantitative severity estimates for a subset of ADRs but were limited in scope by the time and cost they required.

Objective: The aim of this study is to increase the number of ADRs for which there are quantitative severity estimates while improving the quality of these severity estimates.

Methods: We present a semisupervised approach that estimates ADR severity by using social media word embeddings to construct a lexical network of ADRs and perform label propagation. We used this method to estimate the severity of 28,113 ADRs, representing 12,198 unique ADR concepts from the Medical Dictionary for Regulatory Activities.
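The general idea of label propagation over a similarity network can be sketched as follows. This is a minimal illustration, not the study's implementation: the two-dimensional toy vectors, the seed severities, the k-nearest-neighbor graph construction, and the function names (`cosine`, `build_knn_graph`, `propagate`) are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def build_knn_graph(vectors, k):
    """Link each term to its k most similar terms; edges are symmetric."""
    terms = list(vectors)
    graph = {t: {} for t in terms}
    for t in terms:
        sims = [(cosine(vectors[t], vectors[o]), o) for o in terms if o != t]
        for sim, o in sorted(sims, reverse=True)[:k]:
            if sim > 0:  # keep only positively similar neighbors
                graph[t][o] = sim
                graph[o][t] = sim
    return graph

def propagate(graph, seeds, n_iter=200):
    """Spread seed scores through the graph; seed terms stay clamped."""
    scores = {t: seeds.get(t, 0.5) for t in graph}
    for _ in range(n_iter):
        updated = {}
        for t, nbrs in graph.items():
            if t in seeds:
                updated[t] = seeds[t]  # clamp labeled terms
            elif nbrs:
                total = sum(nbrs.values())
                # similarity-weighted average of the neighbors' scores
                updated[t] = sum(w * scores[o] for o, w in nbrs.items()) / total
            else:
                updated[t] = scores[t]  # isolated term: leave unchanged
        scores = updated
    return scores

# Hypothetical toy embeddings and seed severities (not data from the study).
vectors = {
    "death": [1.0, 0.0],
    "cardiac arrest": [0.9, 0.1],
    "nausea": [0.1, 0.9],
    "dry mouth": [0.0, 1.0],
}
seeds = {"death": 1.0, "dry mouth": 0.0}

scores = propagate(build_knn_graph(vectors, k=2), seeds)
for term in vectors:
    print(f"{term}: {scores[term]:.3f}")
```

In this toy network the unlabeled term closest to the high-severity seed ends up with the higher score, which is the behavior a semisupervised severity estimate relies on.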

Results: Our Severity of Adverse Events Derived from Reddit (SAEDR) scores correlate with real-world outcomes. Against ADR case outcomes in the Food and Drug Administration Adverse Event Reporting System (FAERS), the SAEDR scores had Spearman correlations of 0.595 for death, 0.633 for serious outcome, and -0.748 for no outcome. We investigated different methods for defining initial seed term sets and evaluated their impact on the severity estimates. We analyzed severity distributions for ADRs based on their appearance in boxed warning drug label sections, as well as for ADRs with sex-specific associations. We found that ADRs discovered in the postmarketing period had significantly greater severity than those discovered during clinical trials (P<.001). We created quantitative drug-risk profile (DRIP) scores for 968 drugs; these had a Spearman correlation of 0.377 with drugs ranked by FAERS cases resulting in death in which the given drug was the primary suspect.
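For readers unfamiliar with the statistic, a Spearman correlation is the Pearson correlation of the ranks of two variables. The sketch below computes it in plain Python. The severity scores and outcome fractions are invented toy values, not results from the study, and the function names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tie group i..j
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values for five ADRs (assumed for illustration only).
severity = [0.9, 0.7, 0.4, 0.2, 0.1]               # toy severity scores
death_fraction = [0.30, 0.35, 0.05, 0.02, 0.01]    # toy share of cases ending in death
no_outcome_fraction = [0.05, 0.10, 0.40, 0.60, 0.80]  # toy share with no outcome

rho_death = spearman(severity, death_fraction)
rho_none = spearman(severity, no_outcome_fraction)
print(round(rho_death, 3), round(rho_none, 3))
```

A positive coefficient means higher severity scores tend to accompany a larger share of that outcome; a negative coefficient, as reported for the no-outcome category, means the opposite.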

Conclusions: Our SAEDR and DRIP scores are well correlated with the real-world outcomes of the entities they represent and have demonstrated utility in pharmacovigilance research. We make the SAEDR scores for 12,198 ADRs and the DRIP scores for 968 drugs publicly available to enable more quantitative analysis of pharmacovigilance data.

Keywords: adverse drug reactions; drug safety; machine learning; network analysis; pharmacovigilance; social media; social media for health; word embeddings.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adverse Drug Reaction Reporting Systems
  • Drug Labeling
  • Drug-Related Side Effects and Adverse Reactions*
  • Female
  • Humans
  • Male
  • Pharmacovigilance
  • Social Media*