ETCNN: Extra Tree and Convolutional Neural Network-based Ensemble Model for COVID-19 Tweets Sentiment Classification

Pattern Recognit Lett. 2022 Dec:164:224-231. doi: 10.1016/j.patrec.2022.11.012. Epub 2022 Nov 15.

Abstract

Pandemics affect people negatively, causing fear and disappointment. The global spread of COVID-19 has substantially influenced public sentiment, and analyzing that sentiment could help in devising policies to alleviate negative feelings. Data collected from social media platforms is often unstructured, which leads to low classification accuracy. This study proposes an ensemble model that combines the benefits of handcrafted features and automatic feature extraction by using machine learning and deep learning models together. Unstructured data is obtained, preprocessed, and annotated using TextBlob and VADER before the machine learning models are trained. The efficacy of Word2Vec, TF, and TF-IDF features is also analyzed. Among the machine learning models, the extra tree classifier performs best when trained with TF-IDF features from TextBlob-annotated data, and machine learning models overall perform better with TF-IDF and TextBlob. The proposed model obtains superior performance with both annotation techniques, reaching accuracy scores of 0.97 with TextBlob and 0.95 with VADER when using Word2Vec features. The results indicate that combining machine learning and deep learning models through a voting criterion tends to yield better results than the other machine learning models. Analysis of the sentiments indicates that people predominantly hold negative sentiments regarding COVID-19.
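
The two steps the abstract names most concretely, lexicon-based annotation and a voting criterion over two models, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the three-class label set, the polarity threshold, the use of soft voting (averaging class probabilities), and the hypothetical helper names label_from_score and soft_vote. The polarity score stands in for a TextBlob polarity or a VADER compound score, both of which lie in [-1, 1].

```python
from typing import Sequence

# Assumed label set; the abstract does not state the number of classes.
LABELS = ["negative", "neutral", "positive"]


def label_from_score(score: float, threshold: float = 0.0) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold is an illustrative choice (0.0 here; 0.05 is a common
    convention for VADER compound scores), not a value from the paper.
    """
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def soft_vote(prob_a: Sequence[float], prob_b: Sequence[float]) -> str:
    """Average two models' class probabilities and return the top label.

    prob_a and prob_b stand in for the outputs of a tree-based model and
    a CNN, each ordered as in LABELS. Ties go to the first label in LABELS.
    """
    if len(prob_a) != len(LABELS) or len(prob_b) != len(LABELS):
        raise ValueError("each model must give one probability per label")
    averaged = [(a + b) / 2.0 for a, b in zip(prob_a, prob_b)]
    best = max(range(len(LABELS)), key=lambda i: averaged[i])
    return LABELS[best]
```

For example, soft_vote([0.5, 0.3, 0.2], [0.1, 0.1, 0.8]) averages to [0.3, 0.2, 0.5] and returns "positive", even though the first model alone would have predicted "negative". This is the sense in which a voting criterion lets one model compensate for the other.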

Keywords: COVID-19; Ensemble model; Health informatics; Neuroinformatics; Sentiment analysis.