Sentiment analysis has been increasingly used to analyze online social media data such as tweets and health forum posts. However, previous studies often adopted existing general-purpose sentiment analyzers developed in non-healthcare domains, without assessing their validity or customizing them for the specific study context. In this work, we empirically evaluated three general-purpose sentiment analyzers widely used in previous studies (Stanford Core NLP Sentiment Analysis, TextBlob, and VADER) on two online health datasets, with a general-purpose dataset as the baseline. We show that none of these general-purpose sentiment analyzers produced satisfactory classifications of sentiment polarity. Further, the analyzers generated inconsistent results when applied to the same dataset, and their performance varied considerably across the two health datasets. Significant future work is therefore needed to develop context-specific sentiment analysis tools for analyzing online health data.
Keywords: Computing Methodologies; Social Media.