Topic prediction for tobacco control based on COP9 tweets using machine learning techniques

PLoS One. 2024 Feb 15;19(2):e0298298. doi: 10.1371/journal.pone.0298298. eCollection 2024.

Abstract

The prediction of tweets associated with specific topics offers the potential to automatically focus on and understand online discussions surrounding these issues. This paper introduces a comprehensive approach that centers on the topic of "harm reduction" within the broader context of tobacco control. The study leveraged tweets from the period surrounding the ninth Conference of the Parties to review the Framework Convention on Tobacco Control (COP9) as a case study to pilot this approach. By using Latent Dirichlet Allocation (LDA)-based topic modeling, the study successfully categorized tweets related to harm reduction. Subsequently, various machine learning techniques were employed to predict these topics, achieving a prediction accuracy of 91.87% using the Random Forest algorithm. Additionally, the study explored correlations between retweets and sentiment scores. It also conducted a toxicity analysis to understand the extent to which online conversations lacked neutrality. Understanding the topics, sentiment, and toxicity of Twitter data is crucial for identifying public opinion and its formation. By specifically focusing on the topic of "harm reduction" in tweets related to COP9, the findings offer valuable insights into online discussions surrounding tobacco control. This understanding can aid policymakers in effectively informing the public and garnering public support, ultimately contributing to the successful implementation of tobacco control policies.
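The two-stage approach described in the abstract (LDA to assign topics, then a classifier to predict them) can be illustrated with a minimal sketch. It assumes scikit-learn, bag-of-words counts for LDA, TF-IDF features for the Random Forest, and a handful of invented toy tweets. The abstract specifies none of these choices, so the function names, parameters, and data below are hypothetical and are not the authors' implementation.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy tweets for illustration only; not data from the study.
tweets = [
    "harm reduction saves lives and vaping helps smokers quit",
    "vaping is a harm reduction tool for adult smokers",
    "smokers deserve access to safer nicotine alternatives",
    "harm reduction must be part of tobacco control policy",
    "safer nicotine products reduce smoking related harm",
    "adult smokers switch to vaping to quit cigarettes",
    "delegates meet to discuss the tobacco treaty agenda",
    "the conference agenda covers taxation and advertising bans",
    "governments report progress on the treaty implementation",
    "delegates debate advertising bans and plain packaging",
    "the treaty secretariat publishes the conference report",
    "governments agree on taxation measures at the conference",
]


def label_topics(texts, n_topics=2, seed=0):
    """Assign each text its dominant LDA topic (assumed labelling step)."""
    counts = CountVectorizer(stop_words="english").fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    doc_topic = lda.fit_transform(counts)  # shape: (n_texts, n_topics)
    return doc_topic.argmax(axis=1)


def train_topic_classifier(texts, labels, seed=0):
    """Fit a Random Forest on TF-IDF features and return held-out accuracy."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, random_state=seed
    )
    vectorizer = TfidfVectorizer(stop_words="english")
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(vectorizer.fit_transform(x_train), y_train)
    predictions = clf.predict(vectorizer.transform(x_test))
    return accuracy_score(y_test, predictions)


topics = label_topics(tweets)
accuracy = train_topic_classifier(tweets, topics)
print("LDA topic per tweet:", topics.tolist())
print("Held-out accuracy on toy data:", accuracy)
```

The accuracy printed here reflects only the toy data and says nothing about the 91.87% reported in the study; a real replication would need the COP9 tweet corpus, the authors' preprocessing, and their evaluation protocol.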

MeSH terms

  • Communication
  • Humans
  • Machine Learning
  • Public Opinion
  • Social Media*

Grants and funding

"This work is funded by/The authors received funding for this work from Bloomberg Philanthropies, as part of the Bloomberg Initiative to Reduce Tobacco Use. The funders played no role in the research."