Background: The Food and Drug Administration (FDA) in the United States and the European Medicines Agency (EMA) have recognized social media as a new data source to strengthen their drug safety activities. Objective: Our objective in the ADR-PRISM project was to provide text mining and visualization tools to explore a corpus of posts extracted from social media. We evaluated this approach on a corpus of 21 million posts from five patient forums and conducted a qualitative analysis of the data available on methylphenidate in this corpus. Methods: We applied text mining methods based on named entity recognition and relation extraction to the corpus, followed by signal detection using the proportional reporting ratio (PRR). We also used topic modeling based on the Correlated Topic Model to obtain the list of themes in the corpus and to classify the messages by topic. Results: We automatically identified 3443 posts about methylphenidate published between 2007 and 2016, in which 61 adverse drug reactions (ADRs) were automatically detected. Two pharmacovigilance experts manually evaluated the quality of the automatic identification, which reached an F-measure of 0.57. Patients' reports mainly concerned neuro-psychiatric effects. Applying the PRR, 67% of the ADRs were signals, including most of the neuro-psychiatric symptoms but also palpitations. Topic modeling showed that the most represented topics were related to Childhood and Treatment initiation, but also Side effects. Cases of misuse were also identified in this corpus, including recreational use and abuse. Conclusion: Named entity recognition, signal detection, and topic modeling proved complementary in mining social media data. An in-depth analysis focused on methylphenidate showed that this approach was able to detect potential signals and to provide a better understanding of patients' behaviors regarding drugs, including misuse.
Keywords: data mining; drug misuse; drug-related side effects and adverse reactions; methylphenidate; natural language processing; pharmacovigilance; social media.
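Illustration: The signal detection step in Methods relies on the PRR, which compares how often a reaction is reported with the drug of interest against how often it is reported with all other drugs. The Python sketch below uses the standard 2x2 definition, PRR = [a/(a+b)] / [c/(c+d)]. The names `Counts`, `prr`, and `is_signal`, the example counts, and the thresholds (PRR of at least 2 with at least 3 cases) are assumptions chosen for illustration; the abstract does not state which signal criteria the study applied.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """2x2 contingency table for one drug/reaction pair (hypothetical layout)."""
    a: int  # posts with the drug of interest and the reaction
    b: int  # posts with the drug of interest, without the reaction
    c: int  # posts with other drugs and the reaction
    d: int  # posts with other drugs, without the reaction


def prr(t: Counts) -> float:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    if t.a + t.b == 0 or t.c + t.d == 0:
        raise ValueError("each row of the table needs at least one post")
    drug_rate = t.a / (t.a + t.b)
    other_rate = t.c / (t.c + t.d)
    if other_rate == 0:
        # Reaction never reported with other drugs: the ratio is unbounded
        # when the drug has cases, and undefined (NaN) when it has none.
        return math.inf if drug_rate > 0 else math.nan
    return drug_rate / other_rate


def is_signal(t: Counts, min_prr: float = 2.0, min_cases: int = 3) -> bool:
    """Flag a pair as a potential signal.

    The default thresholds are assumed for this sketch; they are not
    taken from the study.
    """
    return t.a >= min_cases and prr(t) >= min_prr


# Made-up counts, for illustration only (not data from the corpus).
example = Counts(a=20, b=980, c=100, d=19900)
print(f"PRR = {prr(example):.2f}, signal = {is_signal(example)}")
```

In practice a chi-square test or a confidence interval on the PRR is often added as a further criterion; it is left out here to keep the sketch short.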