#Election2020: the first public Twitter dataset on the 2020 US Presidential election

J Comput Soc Sci. 2022;5(1):1-18. doi: 10.1007/s42001-021-00117-9. Epub 2021 Apr 2.

Abstract

Credible evidence-based political discourse is a critical pillar of democracy and is at the core of guaranteeing free and fair elections. The study of online chatter is paramount, especially in the wake of important voting events like the recent November 3, 2020 U.S. Presidential election and the inauguration on January 20, 2021. Limited access to social media data is often the primary obstacle to studying and understanding online political discourse. To mitigate this impediment and empower the Computational Social Science research community, we are publicly releasing a massive-scale, longitudinal dataset of U.S. politics- and election-related tweets. This multilingual dataset encompasses over 1.2 billion tweets and tracks all salient U.S. political trends, actors, and events from 2019 to the time of this writing. It predates and spans the entire period of the Republican and Democratic primaries, with real-time tracking of all presidential contenders on both sides of the aisle. The dataset also focuses on presidential and vice-presidential candidates, the presidential elections and the transition from the Trump administration to the Biden administration. Our dataset release is curated, documented, and will continue to track relevant events. We hope that the academic community, computational journalists, and research practitioners alike will all take advantage of our dataset to study relevant scientific and social issues, including problems like misinformation, information manipulation, conspiracies, and the distortion of online political discourse that has been prevalent in the context of recent election events in the United States. Our dataset is available at: https://github.com/echen102/us-pres-elections-2020.

Keywords: Presidential election; Social media analysis; Twitter.