Stock price movement prediction based on Stocktwits investor sentiment using FinBERT and ensemble SVM

PeerJ Comput Sci. 2023 Jun 7;9:e1403. doi: 10.7717/peerj-cs.1403. eCollection 2023.

Abstract

Investor sentiment plays a crucial role in the stock market, and in recent years numerous studies have aimed to predict future stock prices by analyzing market sentiment obtained from social media or news. This study investigates the use of investor sentiment from Stocktwits, a social media platform for investors. Using Stocktwits sentiment to predict stock price movements is challenging for two reasons: a lack of user-initiated sentiment data, and the limitations of existing sentiment analyzers, which may inaccurately classify neutral comments. To overcome these challenges, this study generates sentiment with FinBERT, a pre-trained language model specifically designed to analyze the sentiment of financial text, and proposes an ensemble support vector machine to improve the accuracy of stock price movement predictions. The model predicts the future movement of SPDR S&P 500 Index Exchange Traded Funds, using a rolling window approach to prevent look-ahead bias. A comparison of various techniques for generating sentiment shows that the FinBERT model yields the best results, with an F1-score that is 4-5% higher than other techniques. Additionally, the proposed ensemble support vector machine improves the accuracy of stock price movement predictions compared with the original support vector machine in a series of experiments.
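
The rolling window evaluation and the ensemble step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `rolling_windows` and `majority_vote`, the window sizes, and the use of majority voting to combine the base classifiers are all assumptions made for illustration, since the abstract does not state how the ensemble combines its support vector machines.

```python
from collections import Counter
from typing import Iterator, List, Sequence, Tuple


def rolling_windows(
    n_samples: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) index ranges in chronological order.

    Each test block starts right after its training block, so a model
    fitted on train_idx never sees observations from the future.
    Hypothetical helper; the sizes are illustrative assumptions.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        split = start + train_size
        yield range(start, split), range(split, split + test_size)
        start += test_size  # slide the window forward by one test block


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model label lists into one list by majority vote.

    Assumed combination rule; use an odd number of models with
    binary labels so that no ties occur.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


if __name__ == "__main__":
    # 10 daily observations, train on 4 days, test on the next 2.
    for train_idx, test_idx in rolling_windows(10, 4, 2):
        print(list(train_idx), list(test_idx))

    # Up (1) / down (0) predictions from three hypothetical base models.
    print(majority_vote([[1, 0, 1], [1, 1, 0], [0, 1, 1]]))
```

In a real pipeline, each training range would be used to fit the base classifiers on sentiment features, and each test range would be scored with the combined prediction. Because every test index is strictly later than every training index in the same window, the split itself does not introduce look-ahead bias.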

Keywords: FinBERT; Financial; Machine learning; SPY; SVM; Sentiment analysis; Stock price prediction; Stocktwits.

Associated data

  • figshare/10.6084/m9.figshare.20237736.v1

Grants and funding

This work was supported by the Kyushu Institute of Technology—National Taiwan University of Science and Technology Joint Research Program, under Grant Kyutech-NTUST-109-03. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.