Identifying plausible adverse drug reactions using knowledge extracted from the literature

J Biomed Inform. 2014 Dec;52:293-310. doi: 10.1016/j.jbi.2014.07.011. Epub 2014 Jul 19.


Pharmacovigilance involves continually monitoring drug safety after drugs are put to market. To aid this process; algorithms for the identification of strongly correlated drug/adverse drug reaction (ADR) pairs from data sources such as adverse event reporting systems or Electronic Health Records have been developed. These methods are generally statistical in nature, and do not draw upon the large volumes of knowledge embedded in the biomedical literature. In this paper, we investigate the ability of scalable Literature Based Discovery (LBD) methods to identify side effects of pharmaceutical agents. The advantage of LBD methods is that they can provide evidence from the literature to support the plausibility of a drug/ADR association, thereby assisting human review to validate the signal, which is an essential component of pharmacovigilance. To do so, we draw upon vast repositories of knowledge that has been extracted from the biomedical literature by two Natural Language Processing tools, MetaMap and SemRep. We evaluate two LBD methods that scale comfortably to the volume of knowledge available in these repositories. Specifically, we evaluate Reflective Random Indexing (RRI), a model based on concept-level co-occurrence, and Predication-based Semantic Indexing (PSI), a model that encodes the nature of the relationship between concepts to support reasoning analogically about drug-effect relationships. An evaluation set was constructed from the Side Effect Resource 2 (SIDER2), which contains known drug/ADR relations, and models were evaluated for their ability to "rediscover" these relations. In this paper, we demonstrate that both RRI and PSI can recover known drug-adverse event associations. However, PSI performed better overall, and has the additional advantage of being able to recover the literature underlying the reasoning pathways it used to make its predictions.

Keywords: Distributional semantics; Literature-based discovery; Pharmacovigilance; Predication-based semantic indexing; Reflective random indexing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Adverse Drug Reaction Reporting Systems
  • Algorithms
  • Biomedical Research
  • Data Mining / methods*
  • Drug-Related Side Effects and Adverse Reactions*
  • Humans
  • Natural Language Processing*
  • ROC Curve
  • Semantics*