An empirical study on prediction of population health through social media

J Biomed Inform. 2019 Nov:99:103277. doi: 10.1016/j.jbi.2019.103277. Epub 2019 Sep 12.

Abstract

Public health measurement is important for government administration because it provides indicators that inform public healthcare strategies. Health status has traditionally been measured through surveys, in which pre-designed questionnaires collect responses from targeted participants. Despite its benefits, this traditional approach is costly, time-consuming, and not scalable. These limitations are a major obstacle for policy makers seeking to develop up-to-date healthcare programs. This paper studies the use of health-related information conveyed in user-generated content from social media to predict health outcomes at the population level. Specifically, we investigate linguistic features for analysing textual data, propose the use of visual features learnt from deep neural networks for understanding visual data, and introduce collective social capital information derived from location-based social media data. We conducted extensive experiments on large-scale datasets collected from two online social networks, Foursquare and Flickr, on the task of predicting U.S. county health indices. Experimental results showed that visual and collective social capital data achieved comparable prediction performance and outperformed textual information. These promising results suggest the potential of social media for health analysis at population scale.
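
To make the population-level prediction task concrete, the following is a minimal sketch of one way per-post features could be pooled by county and related to a county health index. It is not the authors' method: the abstract does not specify the aggregation scheme or the prediction model, so the mean pooling, the single-feature least-squares fit, the function names (aggregate_by_county, fit_simple_regression), and all of the toy numbers are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_county(posts):
    """Average per-post feature vectors into one vector per county.

    `posts` is an iterable of (county_id, feature_vector) pairs.
    Mean pooling is an assumption; the abstract does not state how
    post-level features are combined.
    """
    grouped = defaultdict(list)
    for county, vec in posts:
        grouped[county].append(vec)
    return {
        county: [mean(column) for column in zip(*vecs)]
        for county, vecs in grouped.items()
    }


def fit_simple_regression(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical toy data: two features per post, three counties.
posts = [
    ("A", [1.0, 0.0]),
    ("A", [3.0, 2.0]),
    ("B", [4.0, 1.0]),
    ("C", [6.0, 3.0]),
]
# Hypothetical county health index values (not real data).
health_index = {"A": 5.0, "B": 9.0, "C": 13.0}

county_features = aggregate_by_county(posts)
counties = sorted(county_features)
xs = [county_features[c][0] for c in counties]  # first feature only
ys = [health_index[c] for c in counties]
slope, intercept = fit_simple_regression(xs, ys)

# Predict the index for a county whose pooled first feature is 5.0.
predicted = slope * 5.0 + intercept
```

A realistic pipeline would use many features per modality (textual, visual, social capital), a multivariate model, and held-out counties for evaluation; the sketch only shows the shape of the data flow from posts to a county-level prediction.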

Keywords: Healthcare; Public health; Social media.

MeSH terms

  • Data Visualization
  • Empirical Research
  • Health Status*
  • Humans
  • Neural Networks, Computer
  • Population Health / statistics & numerical data*
  • Psycholinguistics
  • Public Health
  • Social Media*
  • Surveys and Questionnaires