Biases in using social media data for public health surveillance: A scoping review

Int J Med Inform. 2022 Aug;164:104804. doi: 10.1016/j.ijmedinf.2022.104804. Epub 2022 May 23.


Objectives: To conduct a landscape scan, through a scoping review, of the methods used to assess or mitigate biases when using social media data for public health surveillance.

Materials and methods: Following best practices, we searched two literature databases (i.e., PubMed and Web of Science) and covered literature published up to July 2021. Through two rounds of screening (i.e., title/abstract screening, and then full-text screening), we extracted study objectives, analysis methods, and the methods used to assess or address the different biases from the eligible articles.

Results: We identified a total of 2,856 articles from the two databases. After the screening processes, we extracted and synthesized 20 studies that either assessed or mitigated biases when leveraging social media data for public health surveillance. Researchers have tried to assess or address several different types of biases such as demographic bias, keyword bias, and platform bias. In particular, we found 11 studies that tried to measure the reliability of the research findings from social media data by comparing them with other data sources.
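
A minimal sketch of the kind of cross-source comparison described in the Results, assuming the comparison is done by correlating a social media signal with a reference surveillance series. The abstract does not say which statistics the reviewed studies used, so Pearson correlation is an assumption here, and the function name `pearson_r` and both weekly series are hypothetical, illustrative values rather than data from any study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly counts (illustrative only, not from the review):
# posts matching a symptom keyword vs. cases reported by a reference system.
social_media_mentions = [120, 150, 310, 480, 400, 220]
reference_case_counts = [30, 41, 77, 118, 101, 52]

r = pearson_r(social_media_mentions, reference_case_counts)
print(f"Pearson r between social media and reference series: {r:.3f}")
```

A high correlation under this sketch would only indicate that the two series move together; it would not by itself rule out demographic, keyword, or platform bias in the social media sample.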

Discussion and conclusion: We synthesized the types of biases and the methods used to assess or address them in studies that use social media data for public health surveillance. Despite the large number of publications using social media data, we found that very few studies considered the various biases that can arise from data collection through analysis. Overlooking bias can distort study results and lead to unintended consequences, especially in public health surveillance. These research gaps warrant further, more systematic investigation. Strategies for addressing bias from other fields could be adopted in future public health surveillance systems that use social media data.

Keywords: Bias; Public health surveillance; Social media.

Publication types

  • Review
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Bias
  • Data Collection
  • Humans
  • Public Health
  • Public Health Surveillance
  • Reproducibility of Results
  • Social Media*