InferBERT: A Transformer-Based Causal Inference Framework for Enhancing Pharmacovigilance

Xingqiao Wang; Xiaowei Xu; Weida Tong; Ruth Roberts; Zhichao Liu

doi:10.3389/frai.2021.659622

Front Artif Intell. 2021 May 26:4:659622. doi: 10.3389/frai.2021.659622. eCollection 2021.

Authors

Xingqiao Wang¹, Xiaowei Xu¹, Weida Tong², Ruth Roberts³,⁴, Zhichao Liu²

Affiliations

¹ Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR, United States.
² FDA/National Center for Toxicological Research, Jefferson, AR, United States.
³ ApconiX Ltd, Alderley Park, Alderley Edge, United Kingdom.
⁴ Department of Biosciences, University of Birmingham, Birmingham, United Kingdom.

Abstract

Background: Transformer-based language models have delivered clear improvements in a wide range of natural language processing (NLP) tasks. However, those models have a significant limitation: they cannot infer causality, a prerequisite for deployment in pharmacovigilance and health care. Transformer-based language models therefore need to be extended to infer causality so that they can address the key question of what causes a clinical outcome.

Results: In this study, we propose an innovative causal inference model, InferBERT, which integrates A Lite Bidirectional Encoder Representations from Transformers (ALBERT) with Judea Pearl's do-calculus to establish potential causality in pharmacovigilance. Two FDA Adverse Event Reporting System (FAERS) case studies, analgesics-related acute liver failure and tramadol-related mortalities, were employed to evaluate the proposed InferBERT model. The InferBERT model yielded accuracies of 0.78 and 0.95 for identifying analgesics-related acute liver failure and tramadol-related death cases, respectively. The inferred causes of the two clinical outcomes (i.e., acute liver failure and death) were highly consistent with clinical knowledge. Furthermore, inferred causes were organized into a causal tree using the proposed recursive do-calculus algorithm to improve the model's understanding of causality. Moreover, the high reproducibility of the proposed InferBERT model was demonstrated by a robustness assessment.

Conclusion: The empirical results demonstrated that the proposed InferBERT approach can both predict clinical events and infer their causes. Overall, the proposed InferBERT model is a promising approach for establishing causal effects in text-based observational data to enhance our understanding of intrinsic causality.

Availability and implementation: The InferBERT model and preprocessed FAERS data sets are available on GitHub at https://github.com/XingqiaoWang/DeepCausalPV-master.

Keywords: artificial intelligence; causal inference; language models; natural language processing; pharmacovigilance.
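
Illustrative sketch: The abstract describes combining a trained outcome classifier with do-calculus to rank candidate causes. The Python below is a minimal, hypothetical illustration of that general idea only, not the InferBERT implementation and not do-calculus proper: it uses a simple input intervention (forcing a term on or off in every report) and compares the mean predicted probability. The toy reports, the term names, and `toy_outcome_probability` (a stand-in for a fine-tuned classifier such as ALBERT) are all assumptions made for illustration and are not FAERS data.

```python
from statistics import mean

# Hypothetical toy reports: each report is a set of feature terms.
REPORTS = [
    {"drug_a", "age_65_plus"},
    {"drug_a", "female"},
    {"drug_b", "age_65_plus"},
    {"drug_b", "female"},
]


def toy_outcome_probability(features):
    """Stand-in for a trained classifier's predicted P(outcome | report)."""
    p = 0.1
    if "drug_a" in features:
        p += 0.5
    if "age_65_plus" in features:
        p += 0.2
    return p


def interventional_effect(reports, term, predict):
    """Mean predicted probability with `term` forced on minus forced off."""
    forced_on = mean(predict(r | {term}) for r in reports)
    forced_off = mean(predict(r - {term}) for r in reports)
    return forced_on - forced_off


def rank_terms(reports, predict):
    """Rank every observed term by its estimated effect, largest first."""
    terms = sorted(set().union(*reports))
    effects = {t: interventional_effect(reports, t, predict) for t in terms}
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for term, effect in rank_terms(REPORTS, toy_outcome_probability):
        print(f"{term}: {effect:+.2f}")
```

A recursive variant could restrict the reports to those containing the top-ranked term and repeat the ranking on that subset to grow a tree of candidate causes; how the published recursive do-calculus algorithm does this is not specified in the abstract, so that step is left out here.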